WO2008064577A1 - A method and an apparatus for adjusting quantization quality in encoder and decoder - Google Patents

A method and an apparatus for adjusting quantization quality in encoder and decoder

Info

Publication number
WO2008064577A1
WO2008064577A1 PCT/CN2007/003799 CN2007003799W WO2008064577A1 WO 2008064577 A1 WO2008064577 A1 WO 2008064577A1 CN 2007003799 W CN2007003799 W CN 2007003799W WO 2008064577 A1 WO2008064577 A1 WO 2008064577A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
scaling factor
unit
shaping
factor
Prior art date
Application number
PCT/CN2007/003799
Other languages
French (fr)
Chinese (zh)
Other versions
WO2008064577A8 (en
Inventor
Wei Li
Lijing Xu
Qing Zhang
Jianfeng Xu
Shenghu Sang
Zhengzhong Du
Yao Zou
Peilin Liu
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to EP07855801A priority Critical patent/EP2104095A4/en
Publication of WO2008064577A1 publication Critical patent/WO2008064577A1/en
Publication of WO2008064577A8 publication Critical patent/WO2008064577A8/en

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components

Definitions

  • The present invention relates to coding techniques, and more particularly to a method and apparatus for adjusting quantization quality in codecs.
  • Background technique
  • Technologies that can satisfy low bit rate and high quality audio coding mainly include AAC+, EAAC+ and AMR-WB+.
  • AAC+ and EAAC+ are extended from high-rate audio encoders.
  • AMR-WB+ is a hybrid coding method formed by extending low-rate speech coding.
  • The sampled values are generally time-frequency transformed, the spectral coefficients are then weighted and quantized according to auditory characteristics, and the quantized spectral coefficients are entropy coded for transmission.
  • The main distortion in coding results from the quantization of various parameters. Therefore, to adapt to different needs, the encoder must adjust the quantization quality according to the specified code rate. In a high bit rate coding scheme, for example above 24 kbps, a good encoder reaches transparent sound quality, that is, the human ear cannot detect the noise introduced in the coding quantization process.
  • In a low code rate coding scheme, the shortage of bits makes fully transparent sound quality impossible, so the encoder can only pursue subjective distortion that is as small as possible.
  • A commonly used technique for adjusting the quantization quality is to use a scaling factor or gain.
  • The coefficients to be encoded are first divided by the scaling factor or multiplied by the gain, and the scaled coefficients are then quantized.
  • The most suitable scaling factor makes the quantization error as small as possible while satisfying the code rate requirement. Therefore, when the code rate is relatively high, a smaller scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively large and the quantization is relatively fine; when the code rate is relatively low, a larger scaling factor is selected.
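  • The effect of the scaling factor on quantization fineness can be illustrated with a minimal Python sketch. Rounding to the nearest integer stands in for the quantizer, and the function names and sample values are illustrative assumptions rather than part of any cited codec.

```python
def quantize_with_scale(coeffs, scale):
    """Divide each coefficient by the scaling factor, then round to an integer."""
    return [round(c / scale) for c in coeffs]

def reconstruct(quantized, scale):
    """Undo the scaling so the result can be compared with the original."""
    return [q * scale for q in quantized]

def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

coeffs = [12.3, -7.8, 3.1, 0.6]             # illustrative values only
fine = quantize_with_scale(coeffs, 0.5)     # small factor: wide integer range
coarse = quantize_with_scale(coeffs, 4.0)   # large factor: narrow integer range

err_fine = squared_error(coeffs, reconstruct(fine, 0.5))
err_coarse = squared_error(coeffs, reconstruct(coarse, 4.0))
```

  • With the smaller factor the integers span a wider range, which costs more bits but leaves a smaller reconstruction error; the larger factor trades accuracy for fewer bits.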
  • FIG. 1 shows a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm.
  • the entire coding frequency band is equally divided into 32 sub-bands, each of which is assigned a scaling factor, and a global scaling factor is assigned to the entire frequency band;
  • The closed-loop search algorithm adjusts the global scaling factor so that the number of quantization bits stays within the range allowed by the current bit rate, while adjusting the scaling factors of the sub-bands so that the quantization noise stays below the masking threshold of the human ear as far as possible, that is, the human ear does not perceive the quantization noise.
  • The quantized coefficients are finally Huffman coded and transmitted in the code stream.
  • the sub-band multi-scaling factor coding method in the MPEG1-LAYER3 coding algorithm has the following defects:
  • Subband division requires 32 subband analysis filter banks, and the computational complexity is high;
  • FIG. 2 shows the flow chart of the Transform Coded Excitation (TCX) section of the AMR-WB+ audio coding algorithm.
  • TCX Transform Coded Excitation
  • the so-called important frequency band refers to the low frequency band.
  • The amplification factor calculated in spectrum pre-shaping is not transmitted in the encoded code stream; instead, during spectrum inverse shaping, the amplification factor for each block of frequency domain samples is recalculated using the same method as in spectrum pre-shaping.
  • the recovered frequency domain samples are obtained by dividing the frequency domain samples of each block by the amplification factor of the corresponding block.
  • the inventors found that the global scaling factor algorithm of the existing AMR-WB+ audio coding algorithm TCX part has at least the following defects:
  • Embodiments of the present invention provide a method for adjusting quantization quality in coding, which reduces implementation complexity.
  • Embodiments of the present invention provide a method for adjusting quantization quality in decoding, which can ensure quantization quality.
  • Embodiments of the present invention provide an apparatus for adjusting quantization quality in encoding, which reduces implementation complexity.
  • Embodiments of the present invention provide an apparatus for adjusting quantization quality in decoding, which can ensure quantization quality.
  • An embodiment of the present invention provides a method for adjusting quantization quality in coding, where the method includes: using two or more scaling factors to adjust a first sample value to be encoded, and then quantizing the adjusted first sample value to obtain a quantized sample value; removing the influence of the scaling factors from the obtained quantized sample value to obtain a second sample value, and obtaining a global gain from the first sample value and the second sample value; and outputting the obtained quantized sample value, the information of the two or more scaling factors, and the resulting global gain as an encoded stream.
  • An embodiment of the present invention provides a method for adjusting quantization quality in decoding, in which an encoded stream output by an encoding end is decoded to obtain a decoded stream, where the method includes: acquiring a quantized sample value, information of two or more scaling factors, and a global gain from the decoded stream; using the information of the two or more scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value; and multiplying the sample value by the global gain.
  • An embodiment of the present invention provides an apparatus for adjusting quantization quality in coding, where the apparatus includes: a multi-scaling factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit; wherein the multi-scaling factor control unit is configured to receive a first sample value, set two or more scaling factors for the first sample value, adjust the first sample value by using the scaling factors, and output the adjusted first sample value to the quantization unit; the quantization unit is configured to quantize the received first sample value to obtain a quantized sample value and output it to the gain balancing unit; the gain balancing unit is configured to receive the quantized sample value, remove the influence of the scaling factors from the quantized sample value to obtain a second sample value, and output it to the global gain calculation unit; and the global gain calculation unit is configured to receive the first sample value and the second sample value, and obtain the global gain by using them.
  • An embodiment of the present invention provides an apparatus for adjusting quantization quality in decoding, where the apparatus includes: a gain balancing unit and a global gain balancing unit; wherein the gain balancing unit is configured to receive a quantized sample value and scaling factors, use the received scaling factors to remove the influence of the scaling factors from the quantized sample value to obtain a sample value, and output the sample value to the global gain balancing unit; and the global gain balancing unit is configured to receive the global gain and the sample value, multiply the sample value by the global gain, and output the result.
  • The method and apparatus for adjusting the quantization quality according to the embodiments of the present invention differ from the prior-art scheme that uses a filter bank: the sample values are directly divided into a plurality of parts and a scaling factor is set for each part, which can greatly reduce the implementation complexity. Moreover, unlike the prior-art scheme that uses a global scaling factor, multiple scaling factors are used, so the quantization quality of important parts can be better adjusted and a better coding effect can be obtained.
  • FIG. 1 is a schematic block diagram of an MPEG1-LAYER3 audio coding algorithm in the prior art
  • Figure 2 is a flow chart showing the TCX portion of the AMR-WB+ audio coding algorithm in the prior art
  • FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 1 of the present invention
  • FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 1 of the present invention
  • FIG. 5 is a flow chart for adjusting the quantization quality by a multi-scaling factor at the encoding end;
  • FIG. 6 is a flowchart of selecting a plurality of scaling factors and fine-tuning frequency domain samples of an entire frequency band according to Embodiment 1 of the present invention
  • FIG. 7 is a flowchart of adjusting the quantization quality by a multi-scaling factor at a decoding end according to Embodiment 1 of the present invention.
  • FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality according to Embodiment 2 of the present invention
  • FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality according to Embodiment 2 of the present invention
  • FIG. 10 is a schematic diagram of peak pre-shaping in Embodiment 2 of the present invention
  • FIG. 11 is a schematic diagram of implementing peak inverse shaping in Embodiment 2 of the present invention
  • FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3 of the present invention
  • FIG. 14 is a structural diagram of an apparatus for adjusting quantization quality at an encoding end according to Embodiment 4 of the present invention;
  • Figure 15 is a block diagram showing the arrangement of the apparatus for adjusting the quantization quality at the decoding end in the fourth embodiment of the present invention.
  • Detailed description
  • the main idea of adjusting the quantization quality provided by the embodiment of the present invention is to utilize multiple scaling factors.
  • the encoding process of time-frequency transforming the sampled values will be mainly described.
  • the embodiment of the present invention can still be applied to the case where the time-frequency transform is not performed on the sampled values during the encoding process.
  • Embodiment 1 provides a method of adjusting the quantization quality by a multi-scaling factor.
  • FIG. 3 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 1.
  • Time domain sample values are first converted into the frequency domain by time-frequency transform, then adjusted by multi-scaling factor control and quantized, and the quantized sample values are output.
  • The output quantized sample values then pass through gain balancing and inverse time-frequency transform so that the optimal global gain can be calculated.
  • The coded stream needs to transmit three parts: the scaling factors, the quantized values of the frequency domain sample values, and the global gain.
  • FIG. 4 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 1, in which the quantized frequency domain sample values pass through gain balancing and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • Step 501 Convert the time domain sample value to the frequency domain sample value X(k) by time-frequency transform.
  • time-frequency transform such as discrete Fourier transform (DFT), discrete cosine transform (DCT, MDCT, IDCT), and wavelet transform (DWT) may be employed.
  • DFT discrete Fourier transform
  • DCT discrete cosine transform
  • DWT wavelet transform
  • FFT fast Fourier transform
  • A fast Fourier transform (FFT) may be adopted to keep the computational complexity low.
  • Step 502: Perform multi-scaling factor control on the frequency domain sample values, specifically, select suitable multiple scaling factors to fine-tune the frequency domain sample values of the entire frequency band.
  • Step 601: Divide the entire frequency band into m parts to obtain m parts of the frequency domain sample value: $X(1, 2, \ldots, n_1)$, $X(n_1+1, n_1+2, \ldots, n_2)$, $\ldots$, $X(n_{m-1}+1, n_{m-1}+2, \ldots, N)$.
  • The scaling factor for each part is represented by $g_1, g_2, \ldots, g_m$.
  • multiple scaling factors are directly divided on the entire frequency band after time-frequency transform, and it is not necessary to first divide the frequency band into several segments through the filter group, and then set a scaling factor in each segment, thereby Compared with the prior art, the implementation complexity can be greatly reduced.
  • Step 602: Select a reference value $g_0$ for estimating the m scaling factors. The reference value is selected such that the estimated number of consumed bits is less than the maximum allowable number of bits.
  • Step 603: Adjust the m scaling factors in the vicinity of $g_0$.
  • the m scaling factors can be adjusted by reducing the scaling factor of the more important frequency bands and increasing the scaling factor of the unimportant frequency band.
  • the more important frequency band refers to the low frequency band
  • The unimportant frequency band refers to the high frequency band. Since $g_1, g_2, \ldots, g_m$ correspond to the low to high frequency bands respectively, the adjusted m scaling factors increase gradually. Through this adjustment, the quantization quality of the more important frequency bands is relatively high and the quantization quality of the unimportant frequency bands is relatively low, so that the quantization quality over the entire frequency band is optimized.
  • Step 604: Determine whether the estimated number of consumed bits under the adjusted m scaling factors exceeds the total number of bits. If it does, return to step 603 to adjust the scaling factors again. If it does not, the m scaling factors that satisfy the bit budget are denoted $g_1, g_2, \ldots, g_m$.
  • Step 605: Calculate the quantized perceptual distortion based on the adjusted m scaling factors $g_1, g_2, \ldots, g_m$.
  • The value indicates the difference between the original frequency domain sample value X and the sample values obtained by adjusting X with the m scaling factors $g_1, g_2, \ldots, g_m$.
  • Step 606: Determine whether the quantized perceptual distortion is within the unperceivable range. If yes, the m scaling factors obtained after the current adjustment are used as the optimal scaling factors, denoted $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$, and step 607 is then performed; otherwise, return to step 603.
  • When the perceptual distortion is within the unperceivable range, a listener cannot perceive the quantization noise introduced by the encoder.
  • The unperceivable range is a specific range of allowable distortion values.
  • A specific method for determining whether the quantized perceptual distortion is within the unperceivable range is to determine whether the value of the quantized perceptual distortion calculated in step 605 is within the range of allowable distortion. If so, the quantized perceptual distortion is considered unperceivable; otherwise, it is considered perceptible.
  • In step 606, if the quantized perceptual distortion can still be perceived after the above adjustment steps have been repeated M times, the closed-loop selection ends. From the sets of scaling factors tried during the repetitions, the set that minimizes the perceptual distortion is selected as the optimal scaling factors, and then step 607 is performed.
  • the number of closed-loop selections M can be determined according to actual conditions.
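  • The closed-loop selection of steps 601 to 606 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the bit estimate, the distortion measure (plain squared error with no perceptual weighting), the geometric tilt used to adjust the factors around the reference value, and all function and parameter names are hypothetical stand-ins, since the text does not fix them.

```python
import math

def split_bands(x, edges):
    """Step 601: split x into m parts; edges holds the end indices n_1, ..., N."""
    bands, start = [], 0
    for end in edges:
        bands.append(x[start:end])
        start = end
    return bands

def estimate_bits(bands, factors):
    """Hypothetical bit estimate: a sign bit plus magnitude bits per rounded value."""
    total = 0
    for band, g in zip(bands, factors):
        for v in band:
            total += 1 + math.ceil(math.log2(abs(round(v / g)) + 1))
    return total

def distortion(bands, factors):
    """Hypothetical distortion: squared error after scaling, rounding, and rescaling."""
    return sum((v - round(v / g) * g) ** 2
               for band, g in zip(bands, factors) for v in band)

def select_factors(x, edges, g0, max_bits, max_distortion, max_tries=8, step=1.1):
    """Try up to max_tries adjustments around g0 and keep the best admissible set."""
    bands = split_bands(x, edges)
    m = len(bands)
    best = [g0] * m                      # step 602: g0 is assumed to fit the budget
    best_d = distortion(bands, best)
    for t in range(1, max_tries + 1):
        tilt = step ** t
        # Step 603: smaller factors for low bands, larger ones for high bands.
        factors = [g0 * tilt ** (k - (m - 1) / 2) for k in range(m)]
        if estimate_bits(bands, factors) > max_bits:   # step 604
            continue
        d = distortion(bands, factors)                 # step 605
        if d < best_d:
            best, best_d = factors, d
        if d <= max_distortion:                        # step 606
            break
    return best

x = [40.0, -31.5, 22.0, 9.5, -4.0, 2.5, 1.0, -0.5]   # illustrative spectrum
edges = [2, 4, 8]
bands = split_bands(x, edges)
budget = estimate_bits(bands, [2.0] * 3) + 4
chosen = select_factors(x, edges, g0=2.0, max_bits=budget, max_distortion=0.0)
```

  • When no candidate reaches the distortion target within max_tries attempts, the sketch returns the admissible set with the smallest distortion, which mirrors the fallback described for the case of M unsuccessful repetitions.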
  • Step 607: Fine-tune the frequency domain sample value X by using the obtained m optimal scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$, that is, divide the frequency domain sample values of each block by the optimal scaling factor of the corresponding block to obtain the fine-tuned spectrum.
  • The concrete expression is as follows: $\left[ \frac{X(1, \ldots, n_1)}{g_{1opt}},\; \frac{X(n_1+1, \ldots, n_2)}{g_{2opt}},\; \ldots,\; \frac{X(n_{m-1}+1, \ldots, N)}{g_{mopt}} \right]$
  • The fine-tuned frequency domain sample values obtained in steps 601 to 607 above are sent to the encoder.
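  • The per-block division of step 607 can be sketched in Python as follows. The function name, the band edges, and the factor values are illustrative assumptions.

```python
def fine_tune(x, edges, factors):
    """Divide each band of x by its own scaling factor.

    edges holds the end index of each band (n_1, n_2, ..., N), and
    factors holds one scaling factor per band.
    """
    out, start = [], 0
    for end, g in zip(edges, factors):
        out.extend(v / g for v in x[start:end])
        start = end
    return out

x = [8.0, 6.0, 4.0, 2.0, 1.0, 0.5]      # illustrative frequency domain samples
adjusted = fine_tune(x, edges=[2, 4, 6], factors=[0.5, 1.0, 2.0])
```

  • The low band is divided by the smallest factor and therefore occupies the widest range before quantization, which is how the more important band receives finer quantization.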
  • Because the scaling factors are required to recover the data during decoding, they need to be transmitted in the encoded code stream.
  • The scaling factors can be transmitted in a variety of ways, as described below.
  • Mode 1 for transmitting the scaling factors: the m scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ used to fine-tune the frequency sample values are all encoded, so that the data can be recovered more accurately when decoding.
  • Mode 2 for transmitting the scaling factors: among the m scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ used to fine-tune the frequency sample values, select one as the reference scaling factor, calculate the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode the m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, only $\frac{g_{2opt}}{g_{1opt}}, \ldots, \frac{g_{mopt}}{g_{1opt}}$ need to be encoded. In this way, the number of bits consumed can be reduced.
  • Mode 3 for transmitting the scaling factors: among the m scaling factors used to fine-tune the frequency sample values, select one as the reference scaling factor, calculate the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode the reference scaling factor and the m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, $g_{1opt}$ and $\frac{g_{2opt}}{g_{1opt}}, \ldots, \frac{g_{mopt}}{g_{1opt}}$ need to be encoded. In this way, the number of consumed bits can be reduced.
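  • The three transmission modes can be contrasted with a minimal Python sketch. The dictionary payloads and function names are hypothetical, and the actual encoding of each value into bits is omitted.

```python
def encode_mode1(factors):
    """Mode 1: transmit every scaling factor."""
    return {"factors": list(factors)}

def encode_mode2(factors, ref_index=0):
    """Mode 2: transmit only the m-1 ratios to the reference factor."""
    ref = factors[ref_index]
    return {"ratios": [g / ref for i, g in enumerate(factors) if i != ref_index]}

def encode_mode3(factors, ref_index=0):
    """Mode 3: transmit the reference factor and the m-1 ratios."""
    payload = encode_mode2(factors, ref_index)
    payload["reference"] = factors[ref_index]
    return payload

def decode_mode3(payload, ref_index=0):
    """Rebuild all m scaling factors from a Mode 3 payload."""
    ref = payload["reference"]
    factors = [ref * r for r in payload["ratios"]]
    factors.insert(ref_index, ref)
    return factors

g_opt = [0.5, 1.0, 2.0]                 # illustrative optimal scaling factors
mode2 = encode_mode2(g_opt)
restored = decode_mode3(encode_mode3(g_opt))
```

  • In this sketch only Mode 1 and Mode 3 let the decoder rebuild the absolute factors; a Mode 2 payload carries the ratios alone.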
  • the number of preferred scaling factors can be selected according to the requirements of the coding rate and the quality of the quantization. For example, in low bit rate coding, 2 to 3 scaling factors can be selected.
  • Step 503: Quantize the frequency domain sample values obtained through multi-scaling factor control, and output the quantized frequency domain sample value $X_q$.
  • In step 503, different quantization methods may be used according to the coding requirements, for example, multi-level vector quantization, split vector quantization, tree quantization, lattice vector quantization, and the like.
  • Step 504: Remove the influence of the scaling factors from the quantized frequency sample value $X_q$ obtained in step 503 to restore the original frequency domain sample value, that is, perform gain balancing on $X_q$ to obtain $X_{balance}$.
  • The method of gain balancing differs according to the way the scaling factors are transmitted.
  • Gain balancing can be performed by using the multiple scaling factors $g_{1opt}, g_{2opt}, \ldots, g_{mopt}$ selected in step 502. Specifically, the quantized frequency sample value is divided into m parts following the frequency band division of step 601, giving $X_q(1, \ldots, n_1)$, $X_q(n_1+1, \ldots, n_2)$, $\ldots$, $X_q(n_{m-1}+1, \ldots, N)$, and each part is multiplied by the corresponding scaling factor:
  • $X_{balance} = [\, g_{1opt} \cdot X_q(1, \ldots, n_1),\; g_{2opt} \cdot X_q(n_1+1, \ldots, n_2),\; \ldots,\; g_{mopt} \cdot X_q(n_{m-1}+1, \ldots, N) \,]$
  • If the scaling factors are transmitted by Mode 3 above, gain balancing can be performed by using the ratio values of the scaling factors. Specifically, the quantized frequency sample value is again divided into m parts following the frequency band division of step 601. The part corresponding to the reference scaling factor is multiplied by 1, and each remaining part is multiplied by the ratio of its scaling factor to the reference scaling factor. Assuming the scaling factor $g_{1opt}$ of the first part is the reference scaling factor, the gain balance is expressed as:
  • $X_{balance} = [\, X_q(1, \ldots, n_1),\; \frac{g_{2opt}}{g_{1opt}} \cdot X_q(n_1+1, \ldots, n_2),\; \ldots,\; \frac{g_{mopt}}{g_{1opt}} \cdot X_q(n_{m-1}+1, \ldots, N) \,]$
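  • The two gain balancing variants can be compared with a minimal Python sketch. The function name and the sample values are illustrative assumptions.

```python
def gain_balance(xq, edges, multipliers):
    """Multiply each band of the quantized samples by its band multiplier."""
    out, start = [], 0
    for end, g in zip(edges, multipliers):
        out.extend(q * g for q in xq[start:end])
        start = end
    return out

xq = [16, 12, 4, 2, 1, 0]               # illustrative quantized samples
edges = [2, 4, 6]
g_opt = [0.5, 1.0, 2.0]                 # illustrative optimal scaling factors

# Variant 1: multiply by the scaling factors themselves.
full = gain_balance(xq, edges, g_opt)

# Variant 2: multiply by ratios to the reference factor (first band gets 1).
ratios = [g / g_opt[0] for g in g_opt]
relative = gain_balance(xq, edges, ratios)
```

  • In this sketch the two results differ only by the constant factor g_opt[0], a uniform scale that a global gain computed afterwards by least squares would absorb.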
  • Step 505: Perform inverse time-frequency transform on the $X_{balance}$ obtained after gain balancing, converting the restored frequency domain sample value into the restored time domain sample value $x_q(n)$.
  • Step 506: Calculate the optimal global gain $g_{gopt}$ by using the original time domain sample value $x(n)$ and the restored time domain sample value $x_q(n)$.
  • The optimal global gain $g_{gopt}$ minimizes $\sum_n [x(n) - g_g \cdot x_q(n)]^2$. This gives the optimal global gain as $g_{gopt} = \frac{\sum_n x(n)\, x_q(n)}{\sum_n x_q(n)^2}$.
  • The optimal global gain $g_{gopt}$ also needs to be encoded and transmitted for data recovery at the decoder.
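  • The gain that minimizes the squared error between the original samples and the scaled restored samples has the standard least-squares closed form: the sum of products x(n)·x_q(n) divided by the sum of squares of x_q(n). A minimal Python sketch follows; the function names and sample values are illustrative assumptions.

```python
def optimal_global_gain(x, xq):
    """Least-squares gain g that minimizes sum((x[n] - g * xq[n]) ** 2)."""
    num = sum(a * b for a, b in zip(x, xq))
    den = sum(b * b for b in xq)
    return num / den if den else 0.0     # all-zero xq: any gain gives the same error

def gain_error(x, xq, g):
    """Squared error between x and the restored samples scaled by g."""
    return sum((a - g * b) ** 2 for a, b in zip(x, xq))

x = [2.0, -4.0, 6.0]                     # illustrative original samples
xq = [1.0, -2.0, 3.0]                    # illustrative restored samples
g = optimal_global_gain(x, xq)
```

  • Setting the derivative of the squared error with respect to the gain to zero yields this ratio, so any other gain value gives an error at least as large.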
  • The above is the process of adjusting the quantization quality by the multi-scaling factor at the encoding end.
  • the decoding end needs to recover the time domain sampling value according to the quantized frequency sampling value obtained after decoding by the flow shown in FIG. 7, and the specific process includes the following steps:
  • Step 701 Perform gain balance on the quantized frequency sample value by using a scaling factor obtained from the encoded stream.
  • the specific implementation is the same as the method described in step 504, and the description thereof is omitted here. It should be noted that the method of gain balancing is also different according to the way of transmitting the scaling factor, and the gain balancing mode in the encoding end and the gain balancing mode in the decoding end are also consistent.
  • Step 702 Perform inverse time-frequency transform on the frequency domain sample value obtained after the gain balance, and obtain a time domain sample value.
  • Step 703 The time domain sample value is multiplied by the global gain obtained from the encoded stream to obtain a recovered time domain sample value.
  • The multi-scaling factor control technique used in Embodiment 1 can also be applied directly to sample values in the time domain, that is, to the case where there is no time-frequency transform; correspondingly, there is no inverse time-frequency transform when calculating the global gain.
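  • For the case without a time-frequency transform, the encoder and decoder flows can be sketched end to end in Python. This is a minimal sketch under stated assumptions: rounding stands in for the quantizer, all scaling factors are placed in the stream (Mode 1), and the names and values are hypothetical.

```python
def per_band(values, edges, factors, op):
    """Apply op(value, factor) to every value, band by band."""
    out, start = [], 0
    for end, g in zip(edges, factors):
        out.extend(op(v, g) for v in values[start:end])
        start = end
    return out

def encode(x, edges, factors):
    """Scale, quantize, gain-balance, then fit the global gain."""
    adjusted = per_band(x, edges, factors, lambda v, g: v / g)
    xq = [round(v) for v in adjusted]                 # stand-in quantizer
    balanced = per_band(xq, edges, factors, lambda q, g: q * g)
    den = sum(b * b for b in balanced)
    gain = sum(a * b for a, b in zip(x, balanced)) / den if den else 0.0
    return {"xq": xq, "factors": list(factors), "gain": gain}

def decode(stream, edges):
    """Gain-balance the quantized samples, then apply the global gain."""
    balanced = per_band(stream["xq"], edges, stream["factors"],
                        lambda q, g: q * g)
    return [stream["gain"] * b for b in balanced]

x = [8.2, 5.9, 4.1, 2.2, 1.1, 0.4]      # illustrative time domain samples
edges = [2, 4, 6]
stream = encode(x, edges, factors=[0.5, 1.0, 2.0])
restored = decode(stream, edges)
```

  • The encoder repeats the decoder's gain balancing internally so that the global gain is fitted against exactly the samples the decoder will reconstruct.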
  • Embodiment 2 provides a method of adjusting the quantization quality by multi-scaling factors and spectral shaping.
  • FIG. 8 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 2.
  • Time domain sample values are first converted into the frequency domain by time-frequency transform, then pass through spectrum pre-shaping and multi-scaling factor control.
  • The resulting values are quantized and the quantized sample values are output; the output quantized sample values then pass through gain balancing, spectrum inverse shaping, and inverse time-frequency transform so that the optimal global gain can be calculated.
  • The coded stream needs to transmit the scaling factors, the quantized values of the frequency domain sample values, and the global gain.
  • FIG. 9 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 2.
  • In decoding, the quantized frequency domain sample values pass through gain balancing, spectrum inverse shaping, and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • The specific steps of adjusting the quantization quality by the multi-scaling factor and peak shaping build on the flow shown in FIG. 5 in Embodiment 1: a spectrum pre-shaping step is added between the time-frequency transform described in step 501 and the multi-scaling factor control described in step 502, and a spectrum inverse shaping step is added between the gain balancing described in step 504 and the inverse time-frequency transform described in step 505.
  • The specific implementations of spectrum pre-shaping and spectrum inverse shaping are described in detail below.
  • FIG. 10 shows a schematic diagram of spectrum pre-shaping, which can be implemented by the following steps.
  • Step 1001: Determine a spectrum shaping area and label the peaks within the spectrum shaping area.
  • The spectrum shaping area refers to the spectral region of the more important frequency band.
  • The spectrum shaping area can be the front part of the full frequency band; for example, the first quarter can be used.
  • A peak may be defined as a local maximum in the amplitude of the shaped spectrum segment: if $X(i) > X(j)$ for all $j \in [i-\Delta, i+\Delta]$, $j \neq i$, then $X(i)$ is the local maximum of the $2\Delta+1$ points in $[i-\Delta, i+\Delta]$, where the local area can be arbitrarily selected.
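  • The local-maximum criterion for labeling peaks can be sketched in Python as follows. The function and parameter names are hypothetical, and truncating the comparison window at the ends of the spectrum is an assumption that the text does not specify.

```python
def find_peaks(spectrum, region_end, delta):
    """Return indices in [0, region_end) whose amplitude strictly exceeds
    every other amplitude within delta points on either side."""
    mags = [abs(v) for v in spectrum]
    peaks = []
    for i in range(region_end):
        lo = max(0, i - delta)
        hi = min(len(mags) - 1, i + delta)
        if all(mags[i] > mags[j] for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks

spectrum = [1.0, 5.0, 2.0, 0.5, 3.0, 1.0, 0.2, 4.0]   # illustrative amplitudes
peaks = find_peaks(spectrum, region_end=6, delta=1)
```

  • Here region_end plays the role of the spectrum shaping area boundary, so the large value at index 7 is not labeled because it lies outside the area.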
  • Step 1002: Calculate a reference value $p_{ref}$ for spectrum pre-shaping.
  • The principle for selecting the reference value is to ensure that it remains unchanged before and after spectrum shaping.
  • A characteristic parameter of a segment of data can also be used as the reference value $p_{ref}$, to prevent the quantization error from having a large influence on the reference value.
  • The peak energy of the low frequency portion is amplified so that the peaks can be captured by the quantizer. Therefore, in Embodiment 2, only the peaks at a small number of spectral points are amplified.
  • The spectrum pre-shaping technique may also be referred to as peak pre-shaping. With this peak pre-shaping technique, the global gain increases only slightly, and the resulting increase in quantization error is negligible.
  • If a better spectrum shaping effect is desired, the spectral points around a peak can also be amplified. For example, when the local peak of $2\Delta+1$ points is amplified, the $2\Delta$ or fewer points around the peak can also be amplified by the corresponding amplification factor.
  • In this way, the peaks of the frequency domain sample values in the important frequency band are increased, which reduces the quantization error at the smaller peaks of the important frequency band and reduces the probability that spectral peaks in the more important frequency band are lost in quantization.
  • The spectrum shaping area and the peak labeling criterion in the spectrum inverse shaping process should be the same as those in the spectrum pre-shaping process.
  • Step 1102: Calculate a reference value for spectrum inverse shaping, where the parameters used are consistent with those in the spectrum pre-shaping process.
  • The reduction factor is calculated during the spectrum inverse shaping process, so the reference value for spectrum inverse shaping does not need to be transmitted in the encoded stream. The decoding end can likewise use the characteristics of its own sample values to calculate the reference value for spectrum inverse shaping according to the above principle, and further calculate the reduction factor of the corresponding peak, without taking up extra bits.
  • Step 1104 The peak value is reduced by using the calculated peak reduction factor.
  • After spectrum inverse shaping is performed by the above steps, in step 505 the inverse time-frequency transform is applied to the frequency domain sample values obtained after spectrum inverse shaping.
  • Spectrum inverse shaping between gain balancing and the inverse time-frequency transform is also required at the decoding end.
  • The specific implementation is the same as the spectrum inverse shaping performed in the above encoding process, and its description is omitted here.
  • In the above description, spectrum pre-shaping is performed first, followed by multi-scaling factor control.
  • Alternatively, multi-scaling factor control may be performed first, followed by spectrum pre-shaping.
  • In that case, spectrum inverse shaping is performed first, followed by gain balancing. This case is not described in detail.
  • Embodiment 3 provides a method of adjusting quantization quality by spectral shaping.
  • FIG. 12 is a schematic block diagram of an encoder for adjusting quantization quality in Embodiment 3.
  • Time domain sample values are first converted into the frequency domain by time-frequency transform, then pass through spectrum pre-shaping and quantization, and the quantized sample values are output.
  • The output quantized sample values then pass through spectrum inverse shaping and inverse time-frequency transform so that the optimal global gain can be calculated.
  • The coded stream needs to transmit the quantized values of the frequency domain sample values and the global gain.
  • FIG. 13 is a schematic block diagram of a decoder for adjusting quantization quality in Embodiment 3.
  • The quantized frequency domain sample values pass through spectrum inverse shaping and inverse time-frequency transform to obtain time domain sample values, which are finally multiplied by the global gain to restore the time domain sample values.
  • The methods of spectrum pre-shaping and spectrum inverse shaping are consistent with the implementation and technical effects described in Embodiment 2 and are not described in detail here.
  • Embodiment 4 gives an implementation device for adjusting the quantization quality.
  • FIG. 14 is a block diagram showing the configuration of the apparatus for adjusting the quantization quality at the encoding end in Embodiment 4.
  • The apparatus for adjusting the quantization quality at the encoding end includes, among other units, a time-frequency transform unit, a spectrum pre-shaping unit, and a multi-scaling factor control unit.
  • The time-frequency transform unit receives the first sample value, performs time-frequency transform on it, and outputs the result to the spectrum pre-shaping unit.
  • The spectrum pre-shaping unit receives the output of the time-frequency transform unit.
  • The multi-scaling factor control unit receives the first sample value, sets two or more scaling factors for the first sample value, adjusts the first sample value by using the scaling factors, and outputs the adjusted first sample value to the quantization unit;
  • the quantization unit quantizes the received first sample value to obtain a quantized sample value and outputs it to the gain balancing unit;
  • the gain balancing unit receives the quantized sample value, removes the influence of the scaling factors from the quantized sample value to obtain a second sample value, and outputs it to the spectrum inverse shaping unit;
  • the spectrum inverse shaping unit receives the second sample value output by the gain balancing unit, performs spectrum inverse shaping on it, and outputs the result to the inverse time-frequency transform unit;
  • the inverse time-frequency transform unit receives the second sample value from the spectrum inverse shaping unit, performs inverse time-frequency transform on it, and outputs the same to
  • the multi-scale factor control unit includes: a scaling factor setting unit and a sample value adjusting unit; the scaling factor setting unit is configured to set two or more scaling factors for the first sample value and output the set scaling factors; and the sample value adjusting unit is configured to receive the scaling factors and adjust the first sample value by using the scaling factors.
  • the scaling factor setting unit includes: a reference value setting unit, a scaling factor adjusting unit, a consumption bit number estimating unit, and a perceptual distortion calculating unit;
  • the reference value setting unit is configured to set a reference value of the scaling factor, and output the reference value to the scaling factor adjusting unit;
  • the scale factor adjustment unit is configured to adjust a scaling factor according to a reference value, and output the result to the consumption bit number estimation unit and the perceptual distortion calculation unit;
  • the consumption bit number estimation unit is configured to estimate the number of consumed bits according to the scaling factors, determine whether the number of consumed bits is smaller than the total number of bits allowed by the encoding, and send the determination result to the scaling factor adjusting unit;
  • the perceptual distortion calculating unit is configured to calculate the perceptual distortion according to the scaling factors, determine whether the perceptual distortion is within an imperceptible range, and send the determination result to the scaling factor adjusting unit.
  • the spectrum pre-shaping unit includes: a peak marking unit, a reference value calculating unit, an amplification factor calculating unit, and a pre-shaping unit; wherein the peak marking unit is configured to receive the first sample value, mark peaks among the sample values in the spectrum shaping region, and output them to the reference value calculating unit; the reference value calculating unit is configured to calculate a reference value for spectrum pre-shaping by using the peaks, and output the result to the amplification factor calculating unit; the amplification factor calculating unit is configured to calculate, by using the reference value, an amplification factor for each marked peak, and output it to the pre-shaping unit; the pre-shaping unit is configured to pre-shape the spectrum by using the amplification factors.
  • the spectrum inverse shaping unit includes: a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit; wherein the peak marking unit is configured to receive the sample value, mark peaks among the sample values in the spectrum shaping region, and output them to the reference value calculating unit; the reference value calculating unit is configured to calculate a reference value for spectrum inverse shaping by using the peaks, and output the result to the reduction factor calculating unit; the reduction factor calculating unit is configured to calculate, by using the reference value, a reduction factor for each marked peak, and output it to the inverse shaping unit.
  • the inverse shaping unit is configured to perform inverse shaping on the spectrum by using the reduction factors.
  • FIG. 15 is a block diagram showing the structure of the apparatus for adjusting the quantization quality at the decoding end in the fourth embodiment.
  • the apparatus for adjusting the quantization quality at the decoding end includes: a gain balancing unit, a spectrum inverse shaping unit, an inverse time-frequency transform unit, and a global gain balancing unit.
  • the gain balancing unit is configured to receive the quantized sample value and the scaling factor, and use the received scaling factor to remove the influence of the scaling factor from the quantized sample value to obtain a sampled value, and output the sampled value to the spectral inverse shaping unit;
  • the spectrum inverse shaping unit receives the sample value output by the gain balancing unit, performs spectrum inverse shaping on the sample value, and outputs it to the inverse time-frequency transform unit;
  • the inverse time-frequency transform unit receives the sample value from the spectrum inverse shaping unit, performs the inverse time-frequency transform on it, and outputs it to the global gain balancing unit; the global gain balancing unit receives the global gain and the sample value, multiplies the sample value by the global gain, and outputs the result.
  • the global gain balancing unit can be a multiplier.
  • the spectrum inverse shaping unit of the decoding end is the same as that of the encoding end, and includes: a peak marking unit, a reference value calculation unit, a reduction factor calculation unit, and an inverse shaping unit; wherein the peak marking unit receives the sample value, marks peaks among the sample values in the spectrum shaping region, and outputs the peaks to the reference value calculation unit;
  • the reference value calculation unit is configured to calculate a reference value for spectral inverse shaping using a peak value, and output the reference value to the reduction factor calculation unit;
  • the reduction factor calculation unit is configured to calculate a reduction factor of each marker peak value by using a reference value, And outputting to the inverse shaping unit;
  • the inverse shaping unit is configured to perform inverse shaping on the spectrum by using the reduction factor.
  • the embodiments described above can be applied to various coding fields such as audio coding, video coding, and image coding.
  • the present invention can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented in hardware, but in many cases the former is the better implementation.
  • the technical solution of the present invention may also be embodied in the form of a software product, which is stored in a storage medium and includes a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method for adjusting quantization quality in an encoder comprises: adjusting the first sample values to be encoded using two or more scale factors, and quantizing the adjusted first sample values to obtain quantized sample values; removing the influence of the scale factors from the quantized sample values to obtain second sample values, and obtaining a global gain from the first sample values and the second sample values; and encoding the quantized sample values, the two or more scale factors and the global gain into the bit stream. A method for adjusting quantization quality in a decoder, and apparatuses for adjusting quantization quality in an encoder and a decoder, are also provided. These methods and apparatuses can greatly reduce implementation complexity, adjust the quantization quality of important parts more effectively, and achieve a better coding result.

Description

Technical field
The present invention relates to coding technology, and in particular to a method and an apparatus for adjusting quantization quality in encoding and decoding.
Background
With the development of communication technology and the expansion of multimedia services, the coding of digital audio and video requires not only higher coding efficiency and real-time performance but also a wider coding bandwidth. For digital audio coding, the technologies currently able to deliver high-quality audio at low bit rates mainly include AAC+, EAAC+ and AMR-WB+. AAC+ and EAAC+ are extended from high-bit-rate audio encoders, whereas AMR-WB+ is a hybrid coding scheme formed by extending low-bit-rate speech coding.
In typical audio coding, in order to better exploit certain characteristics of the human auditory system, the sample values are generally first time-frequency transformed; the spectral coefficients are then selected, weighted and quantized according to auditory characteristics, and the quantized spectral coefficients are transmitted after entropy coding. The main distortion in coding arises from the quantization of the various parameters. Therefore, to meet different requirements, the encoder needs to adjust the quantization quality according to the specified bit rate. In high-bit-rate coding schemes, for example above 24 kbps, a good encoder achieves transparent sound quality, meaning that the human ear cannot perceive the noise introduced by quantization during coding. In low-bit-rate coding schemes, the shortage of bits makes fully transparent sound quality impossible, so the encoder can only aim for the smallest possible subjective distortion.
A commonly used technique for adjusting quantization quality is to use a scaling factor or a gain: the coefficients to be coded are first divided by the scaling factor or multiplied by the gain, and the scaled coefficients are then quantized. The most suitable scaling factor both satisfies the bit-rate requirement and keeps the quantization error as small as possible. Therefore, when the bit rate is relatively high, a smaller scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively large and the quantization is relatively fine; when the bit rate is relatively low, a larger scaling factor is selected, so that the dynamic range of the quantized coefficients is relatively small and the quantization is relatively coarse.
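The effect of the scaling factor described above can be illustrated with a minimal sketch. It assumes plain rounding as the quantizer; the function names and the sample coefficients are hypothetical and are not taken from any codec.

```python
def quantize(coeffs, scale):
    """Divide each coefficient by the scaling factor, then round to an integer index."""
    return [round(c / scale) for c in coeffs]


def dequantize(indices, scale):
    """Undo the scaling: multiply each integer index by the scaling factor."""
    return [q * scale for q in indices]


def max_error(coeffs, scale):
    """Largest absolute reconstruction error for a given scaling factor."""
    restored = dequantize(quantize(coeffs, scale), scale)
    return max(abs(a - b) for a, b in zip(coeffs, restored))


coeffs = [12.3, -7.8, 3.1, 0.4, -0.2]   # illustrative values only
fine = quantize(coeffs, 0.5)            # small factor: wide index range, fine steps
coarse = quantize(coeffs, 4.0)          # large factor: narrow index range, coarse steps
```

With the smaller factor the indices span a wider range and the reconstruction error is smaller, at the cost of more bits per index; the larger factor trades accuracy for a narrower range.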
Figure 1 is a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm. In this algorithm, before the time-frequency transform, the entire coding band is divided equally into 32 subbands; each subband is assigned a scaling factor, and the entire band is assigned a global scaling factor. Before quantization, a closed-loop search algorithm adjusts the global scaling factor so that the number of quantization bits stays within the range allowed by the current bit rate, and at the same time adjusts the scaling factors within the subbands so that the quantization noise stays, as far as possible, below the masking threshold of the human ear, that is, so that the ear does not perceive the quantization noise. Finally, the quantized coefficient stream is transmitted using Huffman coding.
The subband multi-scaling-factor coding method in the MPEG1-LAYER3 coding algorithm has the following defects:
(1) Subband division requires a 32-subband analysis filter bank, so the computational complexity is high;
(2) The scaling factor of every subband must be quantized, coded and transmitted, which occupies too many bits and is unsuitable for low-bit-rate coding.
Figure 2 is a flow chart of the transform coded excitation (TCX) part of the AMR-WB+ audio coding algorithm. AMR-WB+ audio coding uses a single global scaling factor. A single scaling factor has a limitation: it cannot fine-tune a particular frequency range. In addition, under low-bit-rate coding requirements, frequency-domain samples with low energy are lost in vector quantization; yet because the human auditory system is not equally sensitive to all bands, it is desirable that small frequency-domain samples in important bands still be quantized. For these reasons, AMR-WB+ audio coding uses spectrum pre-shaping and spectrum inverse shaping. In the TCX part of the AMR-WB+ audio coding algorithm, spectrum pre-shaping is first applied to the more important bands of the spectrum to raise the energy of those bands, and the same global scaling factor is then applied to the entire band.
Because the human auditory system has high frequency resolution at low frequencies, the important band generally means the low band. In the spectrum pre-shaping of AMR-WB+ audio coding, the first quarter of the spectrum is divided into blocks of 8 frequency-domain samples, and the energy $E_m$ of each block is calculated, where $m$ is the block index. The largest block energy $E_{max}$ is then found, a ratio value based on $E_{max}/E_m$ is calculated for each block, and the amplification factor $G_m$ of each block is obtained from that value in such a way that the amplification factors $G_m$ decrease monotonically across the blocks. Finally, the frequency-domain samples of each block are multiplied by the amplification factor of that block. In AMR-WB+ audio coding, the amplification factors calculated in spectrum pre-shaping are not transmitted in the coded stream. Instead, spectrum inverse shaping calculates the amplification factor $G_m$ of each block from the frequency-domain samples, using the same method as pre-shaping, and recovers the frequency-domain samples by dividing the samples of each block by the amplification factor of that block.
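The block-wise pre-shaping described above can be sketched roughly as follows. This is only an illustration: the text states that the factor is derived from $E_{max}/E_m$ and decreases monotonically, but the fourth-root mapping, the cap of 10 and all function names below are assumptions made for this sketch, not a statement of the AMR-WB+ specification.

```python
def block_energies(spectrum, block=8):
    """Energy E_m of each consecutive block of `block` frequency-domain samples."""
    return [sum(x * x for x in spectrum[i:i + block])
            for i in range(0, len(spectrum), block)]


def amplification_factors(energies, cap=10.0, floor=1e-12):
    """Per-block factors G_m from E_max / E_m, forced to be non-increasing in m.

    The cap and the fourth root are assumptions of this sketch.
    """
    if not energies:
        return []
    e_max = max(energies)
    factors, prev = [], None
    for e in energies:
        ratio = min(e_max / max(e, floor), cap)
        if prev is not None and ratio > prev:
            ratio = prev                  # keep the sequence non-increasing
        factors.append(ratio ** 0.25)
        prev = ratio
    return factors


def pre_shape(spectrum, block=8):
    """Amplify only the first quarter of the spectrum, block by block."""
    n_low = len(spectrum) // 4
    n_low -= n_low % block                # use whole blocks only
    factors = amplification_factors(block_energies(spectrum[:n_low], block))
    shaped = list(spectrum)
    for m, g in enumerate(factors):
        for k in range(m * block, (m + 1) * block):
            shaped[k] *= g
    return shaped, factors


spectrum = [1.0] * 8 + [4.0] * 8 + [0.5] * 48   # illustrative values only
shaped, factors = pre_shape(spectrum)
```

In this example the weaker first block is amplified while the strongest block keeps a factor of 1, and samples outside the first quarter are left untouched.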
In the course of implementing the present invention, the inventors found that the global scaling factor algorithm in the TCX part of the existing AMR-WB+ audio coding algorithm has at least the following defects:
(1) Because only one scaling factor is used for the full band, the quantization quality can only be adjusted over the entire band, and the more important frequency ranges cannot be emphasized;
(2) Although spectrum pre-shaping and spectrum inverse shaping improve the quantization quality at low frequencies, they do so at the expense of the quantization quality in the remaining bands;
(3) Spectrum pre-shaping and inverse shaping can only be applied to a band of small bandwidth; otherwise the global scaling factor rises markedly and the overall quantization performance degrades;
(4) Because the amplification factors used for pre-shaping at the encoding stage are not recorded in the coded stream, the error produced by quantization accumulates in the reduction factors used for inverse shaping.
Summary of the invention
An embodiment of the present invention provides a method for adjusting quantization quality in encoding, which reduces implementation complexity.
An embodiment of the present invention provides a method for adjusting quantization quality in decoding, which can ensure quantization quality.
An embodiment of the present invention provides an apparatus for adjusting quantization quality in encoding, which reduces implementation complexity.
An embodiment of the present invention provides an apparatus for adjusting quantization quality in decoding, which can ensure quantization quality.
An embodiment of the present invention provides a method for adjusting quantization quality in encoding. The method includes: adjusting the first sample values to be encoded by using two or more scaling factors, and quantizing the adjusted first sample values to obtain quantized sample values; removing the influence of the scaling factors from the quantized sample values to obtain second sample values, and obtaining a global gain from the first sample values and the second sample values; and outputting the quantized sample values, information on the two or more scaling factors, and the global gain as the coded stream.
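The encoding method summarized above can be sketched as follows, without the time-frequency transform. The text does not say how the global gain is derived from the first and second sample values, so the least-squares gain used here is an assumption, as are the function name, the `bounds` representation of the parts and the example values.

```python
def encode(first, bounds, scales):
    """Sketch of the encoding method.

    first  : first sample values (list of floats)
    bounds : exclusive end index of each part (hypothetical representation)
    scales : one scaling factor per part (two or more)
    """
    quantized, second = [], []
    start = 0
    for end, g in zip(bounds, scales):
        for x in first[start:end]:
            q = round(x / g)       # adjust by the scaling factor, then quantize
            quantized.append(q)
            second.append(q * g)   # remove the scaling factor's influence
        start = end
    # Assumed rule: the gain that minimises sum((first - gain * second) ** 2).
    num = sum(x * y for x, y in zip(first, second))
    den = sum(y * y for y in second)
    global_gain = num / den if den else 1.0
    return {"quantized": quantized, "scales": list(scales),
            "bounds": list(bounds), "global_gain": global_gain}


stream = encode([10.2, -6.9, 4.1, 1.3, -0.8, 0.6], bounds=[2, 6], scales=[0.5, 2.0])
```

Here the first two samples use the finer factor 0.5 and the remaining four use the coarser factor 2.0, so the small trailing samples quantize to zero while the leading samples keep more precision.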
An embodiment of the present invention provides a method for adjusting quantization quality in decoding, in which the coded stream output by the encoding end is decoded to obtain a decoded stream. The method includes: obtaining, from the decoded stream, the quantized sample values, information on two or more scaling factors, and the global gain; and, using the information on the two or more scaling factors, removing the influence of the scaling factors from the quantized sample values to obtain sample values, which are then multiplied by the global gain.
An embodiment of the present invention provides an apparatus for adjusting quantization quality in encoding. The apparatus includes: a multi-scaling factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit. The multi-scaling factor control unit is configured to receive the first sample values, set two or more scaling factors for the first sample values, adjust the first sample values by using the scaling factors, and output the adjusted first sample values to the quantization unit. The quantization unit is configured to quantize the received first sample values to obtain quantized sample values and output them to the gain balancing unit. The gain balancing unit is configured to receive the quantized sample values, remove the influence of the scaling factors from the quantized sample values to obtain second sample values, and output them to the global gain calculation unit. The global gain calculation unit is configured to receive the first sample values and the second sample values, and obtain the global gain from them.
An embodiment of the present invention provides an apparatus for adjusting quantization quality in decoding. The apparatus includes: a gain balancing unit and a global gain balancing unit. The gain balancing unit is configured to receive the quantized sample values and the scaling factors, remove the influence of the scaling factors from the quantized sample values by using the received scaling factors to obtain sample values, and output them to the global gain balancing unit. The global gain balancing unit is configured to receive the global gain and the sample values, multiply the sample values by the global gain, and output the result.
According to the method and apparatus for adjusting quantization quality provided by the embodiments of the present invention, unlike the prior-art scheme that uses filters, the sample values are divided directly into multiple parts and a scaling factor is set for each part, which greatly reduces implementation complexity. Moreover, unlike the prior-art scheme that uses a single global scaling factor, multiple scaling factors are used, so the quantization quality of the important parts can be adjusted more effectively and a better coding result can be obtained.
Brief description of the drawings
Figure 1 is a schematic block diagram of the MPEG1-LAYER3 audio coding algorithm in the prior art;
Figure 2 is a flow chart of the TCX part of the AMR-WB+ audio coding algorithm in the prior art;
Figure 3 is a schematic block diagram of the encoder for adjusting quantization quality in Embodiment 1 of the present invention;
Figure 4 is a schematic block diagram of the decoder for adjusting quantization quality in Embodiment 1 of the present invention;
Figure 5 is a flow chart of adjusting quantization quality at the encoding end using multiple scaling factors in Embodiment 1 of the present invention;
Figure 6 is a flow chart of selecting multiple scaling factors and fine-tuning the frequency-domain samples of the entire band in Embodiment 1 of the present invention;
Figure 7 is a flow chart of adjusting quantization quality at the decoding end using multiple scaling factors in Embodiment 1 of the present invention;
Figure 8 is a schematic block diagram of the encoder for adjusting quantization quality in Embodiment 2 of the present invention;
Figure 9 is a schematic block diagram of the decoder for adjusting quantization quality in Embodiment 2 of the present invention;
Figure 10 is a schematic diagram of peak pre-shaping in Embodiment 2 of the present invention;
Figure 11 is a schematic diagram of peak inverse shaping in Embodiment 2 of the present invention;
Figure 12 is a schematic block diagram of the encoder for adjusting quantization quality in Embodiment 3 of the present invention;
Figure 13 is a schematic block diagram of the decoder for adjusting quantization quality in Embodiment 3 of the present invention;
Figure 14 is a structural diagram of the apparatus for adjusting quantization quality at the encoding end in Embodiment 4 of the present invention;
Figure 15 is a structural diagram of the apparatus for adjusting quantization quality at the decoding end in Embodiment 4 of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to specific embodiments.
The main idea of adjusting quantization quality in the embodiments of the present invention is to use multiple scaling factors, optionally together with spectrum shaping, to adjust the quantization quality during encoding. The description below mainly covers an encoding process in which the sample values are time-frequency transformed. The embodiments of the present invention can, of course, also be applied when the sample values are not time-frequency transformed during encoding.
Embodiment 1
Embodiment 1 provides a method for adjusting quantization quality using multiple scaling factors.
Figure 3 is a schematic block diagram of the encoder for adjusting quantization quality in Embodiment 1. During encoding, the time-domain sample values are first converted to the frequency domain by a time-frequency transform, then subjected to multi-scaling-factor control and quantized, and the quantized sample values are output. The output quantized sample values pass through gain balancing and the inverse time-frequency transform, after which the optimal global gain is calculated. The coded stream needs to carry three parts: the scaling factors, the quantized values of the frequency-domain sample values, and the global gain.
Figure 4 is a schematic block diagram of the decoder for adjusting quantization quality in Embodiment 1. During decoding, the quantized frequency-domain sample values pass through gain balancing and the inverse time-frequency transform to give time-domain sample values, which are finally multiplied by the global gain to restore the time-domain sample values.
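The decoder path just described (gain balancing, inverse transform, multiplication by the global gain) can be sketched as below. The function name, the `bounds` layout of the parts and the pluggable `inverse_transform` argument are assumptions of this sketch; by default the inverse transform is the identity so that the example stays self-contained.

```python
def decode(quantized, bounds, scales, global_gain, inverse_transform=lambda s: s):
    """Sketch of the decoder.

    quantized : integer indices read from the coded stream
    bounds    : exclusive end index of each part (hypothetical representation)
    scales    : one scaling factor per part
    """
    balanced = []
    start = 0
    for end, g in zip(bounds, scales):
        # Gain balancing: remove the influence of this part's scaling factor.
        balanced.extend(q * g for q in quantized[start:end])
        start = end
    time_samples = inverse_transform(balanced)         # inverse time-frequency transform
    return [global_gain * v for v in time_samples]     # apply the global gain


restored = decode([20, -14, 2, 1, 0, 0], bounds=[2, 6], scales=[0.5, 2.0], global_gain=1.0)
```

Because the final stage is a single multiplication per sample, it matches the observation elsewhere in the document that the global gain balancing unit can be a multiplier.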
The specific steps for adjusting quantization quality at the encoding end using multiple scaling factors in Embodiment 1 are given below. As shown in Figure 5, they are:
Step 501: Through a time-frequency transform, convert the time-domain sample values into frequency-domain sample values $X(k)$. Time-frequency transforms such as the discrete Fourier transform (DFT), the discrete cosine transform (DCT, MDCT, IDCT) or the discrete wavelet transform (DWT) may be used here. A fast Fourier transform (FFT) may also be used in the time-frequency transform to reduce computational complexity.
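As one possible instance of the transform in step 501, the sketch below implements an orthonormal DCT-II and its inverse directly from the definition. It is an O(N²) illustration chosen for clarity, not the fast algorithm a real encoder would use, and the choice of DCT-II among the transforms listed above is an assumption.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of time-domain samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(norm * s)
    return out


def idct(coeffs):
    """Inverse of the orthonormal DCT-II above (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(coeffs[k] * math.sqrt(2.0 / n) *
                 math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out


samples = [0.5, 1.0, -0.25, 0.75, 0.0, -1.0, 0.3, 0.2]   # illustrative values only
spectrum = dct(samples)
recovered = idct(spectrum)
```

Because the transform is orthonormal, applying `idct` to the unquantized spectrum returns the original samples up to floating-point rounding, so any remaining error in a full codec comes from quantization.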
Step 502: Perform multi-scaling-factor control on the frequency-domain sample values $X(k)$; specifically, select suitable multiple scaling factors and fine-tune the frequency-domain sample values of the entire band.
In this embodiment, assume that $m$ scaling factors are used for the frequency-domain sample values $X(k)$, $k = 0, 1, \dots, N$, of the entire band, and let the maximum number of bits allowed during encoding be $b_{max}$. The steps of selecting suitable multiple scaling factors and fine-tuning the frequency-domain sample values are described in detail below with reference to the flow chart in Figure 6.
Step 601: Divide the entire band into $m$ parts, obtaining $m$ parts of frequency-domain sample values $X(0, 1, \dots, n_1)$, $X(n_1+1, n_1+2, \dots, n_2)$, $\dots$, $X(n_{m-1}+1, n_{m-1}+2, \dots, N)$, and denote the scaling factor of each part by $g_1, g_2, \dots, g_m$.
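The partition in step 601 amounts to slicing the frequency-domain sample array at a list of boundary indices. The sketch below assumes the boundaries are given as the inclusive last index of each of the first m-1 parts; the function name and that representation are hypothetical.

```python
def split_band(samples, boundaries):
    """Split samples into len(boundaries) + 1 contiguous parts.

    boundaries holds the inclusive last index of every part except the final one.
    """
    parts, start = [], 0
    for n in boundaries:
        parts.append(samples[start:n + 1])
        start = n + 1
    parts.append(samples[start:])    # the last part runs to the end of the band
    return parts


parts = split_band(list(range(10)), [2, 5])   # m = 3 parts
```

No filter bank is involved: the parts are plain index ranges of the already transformed spectrum, and one scaling factor is then associated with each part.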
In the embodiments of the present invention, the multiple scaling factors are assigned by directly dividing the entire band after the time-frequency transform. There is no need to first split the band into several segments with a filter bank and then set one scaling factor in each segment, so the implementation complexity is greatly reduced compared with the prior art.
Step 602: Select a reference value $g_0$ for estimating the $m$ scaling factors. The reference value $g_0$ is selected such that the estimated number of consumed bits $b_0$ is smaller than the maximum allowed number of bits $b_{max}$. In this embodiment, the estimated number of consumed bits $b$ is a value that depends on the frequency-domain sample values $X$, the number of frequency-domain sample values $N$ and the scaling factor $g$, and can be expressed as a function $b = \mathrm{cons}(X, N, g)$. Therefore, in step 602, when the reference value of the scaling factor is chosen as $g_0$, the estimated number of consumed bits is $b_0 = \mathrm{cons}(X, N, g_0)$, which satisfies $b_0 < b_{max}$.
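A minimal sketch of step 602 follows. The text leaves the bit estimator unspecified, so the estimator below (one sign bit plus the magnitude bits of each quantized index) is purely an assumption, as are the function names, the starting value and the multiplicative search step.

```python
def estimate_bits(samples, g):
    """Hypothetical bit estimate: sign bit plus magnitude bits per quantized index."""
    return sum(1 + abs(round(x / g)).bit_length() for x in samples)


def choose_reference(samples, b_max, g=0.25, step=2 ** 0.25):
    """Increase g until the estimated bit count drops below b_max."""
    if b_max <= len(samples):
        # With this estimator every sample costs at least one bit.
        raise ValueError("b_max must exceed the number of samples")
    while estimate_bits(samples, g) >= b_max:
        g *= step
    return g


samples = [12.3, -7.8, 3.1, 0.4, -0.2, 5.5, -9.1, 0.05]   # illustrative values only
g0 = choose_reference(samples, b_max=24)
```

The loop terminates because a large enough factor quantizes every sample to zero, at which point the estimate equals the number of samples.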
Step 603: Adjust the $m$ scaling factors $g_1, g_2, \dots, g_m$ in the vicinity of $g_0$.
In step 603, the $m$ scaling factors may be adjusted by lowering the scaling factors of the more important bands and raising those of the less important bands. Here, the more important bands are the low bands and the less important bands are the high bands. Because $g_1$ to $g_m$ correspond to bands from low to high, the adjusted $m$ scaling factors $g_1, g_2, \dots, g_m$ increase progressively. This adjustment gives the more important bands relatively high quantization quality and the less important bands relatively low quantization quality, so that the quantization quality over the entire band is optimized.
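One simple way to realize the adjustment in step 603 is to spread the factors geometrically around the reference value, so that low bands get factors below $g_0$ and high bands get factors above it. The geometric rule, the function name and the `ratio` parameter are assumptions of this sketch; the text only requires that the factors stay near $g_0$ and increase from low to high bands.

```python
def spread_scales(g0, m, ratio=1.2):
    """Return m increasing scaling factors centred geometrically on g0.

    ratio > 1 is the (hypothetical) step between neighbouring bands.
    """
    centre = (m - 1) / 2
    return [g0 * ratio ** (i - centre) for i in range(m)]


scales = spread_scales(2.0, 3, ratio=1.5)   # low, middle and high band factors
```

A larger `ratio` shifts more precision toward the low bands, which is the knob a search loop could vary when it returns to this step.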
Step 604: Determine whether, with the adjusted m scaling factors, the estimated number of consumed bits stays within the total number of bits. If not, return to step 603 and adjust the scaling factors again. If so, denote the m scaling factors that satisfy the bit budget as $g_1, g_2, \dots, g_m$.
Step 605: Compute the quantization perceptual distortion from the adjusted m scaling factors $g_1, g_2, \dots, g_m$.
In this embodiment, the quantization perceptual distortion c is a value related to the frequency-domain samples X and the m scaling factors, and can be expressed by the function $c = f(X, g_1, g_2, \dots, g_m)$. The value of c represents the distortion caused by the difference between the original frequency-domain samples X and the samples obtained by adjusting X with the m scaling factors $g_1, g_2, \dots, g_m$. In step 605, c is evaluated with the adjusted m scaling factors $g_1, g_2, \dots, g_m$.
Step 606: Determine whether the quantization perceptual distortion is within the imperceptible range. If it is, take the m scaling factors obtained in this adjustment as the optimal scaling factors, denoted $g_{1opt}, g_{2opt}, \dots, g_{mopt}$, and proceed to step 607; otherwise, return to step 603.
If the perceptual distortion is within the imperceptible range, a person cannot perceive the quantization noise introduced by the encoder. In audio coding, for example, the human ear cannot perceive that quantization noise; in video coding, the human eye cannot perceive it. The imperceptible range here is a specific numerical range of allowed distortion. To decide whether the quantization perceptual distortion is imperceptible, check whether the value computed in step 605 lies within that allowed range: if it does, the quantization perceptual distortion is considered imperceptible; otherwise it is considered perceptible.
In this embodiment, when step 606 finds that the quantization perceptual distortion is perceptible and it remains perceptible after the adjustment steps above have been repeated M times, the closed-loop selection ends. From the scaling factors obtained during those repetitions, the set that yields the smallest perceptual distortion is selected as the optimal scaling factors, and step 607 is then performed. In practice, the number of closed-loop iterations M can be set according to the actual situation.
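The closed-loop selection of steps 603 to 606 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented procedure: `estimate_bits` and `perceptual_distortion` are hypothetical stand-ins for $\mathrm{cons}(X, N, g)$ and $f(X, g_1, \dots, g_m)$, which the text does not specify, and the linear "tilt" used to lower low-band factors and raise high-band factors is likewise an assumed adjustment rule.

```python
import math

def estimate_bits(blocks, gains):
    """Hypothetical stand-in for b = cons(X, N, g): bits grow with log2 of the scaled magnitude."""
    return sum(math.log2(1.0 + abs(x) / g)
               for block, g in zip(blocks, gains) for x in block)

def perceptual_distortion(blocks, gains):
    """Hypothetical stand-in for c = f(X, g_1..g_m): squared rounding error,
    weighted so that earlier (lower-frequency) blocks count more."""
    total = 0.0
    for k, (block, g) in enumerate(zip(blocks, gains)):
        weight = 1.0 / (k + 1)
        total += weight * sum((x - g * round(x / g)) ** 2 for x in block)
    return total

def select_scaling_factors(blocks, g0, max_bits, max_distortion, max_iters=8):
    """Steps 603-606: tilt the factors around g0, check the bit budget, then the distortion.
    Returns (distortion, gains) for the chosen set, or None if no set fits the budget."""
    m = len(blocks)
    best = None  # best feasible candidate seen so far
    for it in range(1, max_iters + 1):
        tilt = 0.1 * it  # assumed rule: lower the low bands, raise the high bands
        gains = [g0 * (1.0 + tilt * (2.0 * k / max(m - 1, 1) - 1.0)) for k in range(m)]
        if min(gains) <= 0 or estimate_bits(blocks, gains) > max_bits:
            continue  # step 604 failed: adjust again
        c = perceptual_distortion(blocks, gains)  # step 605
        if best is None or c < best[0]:
            best = (c, gains)
        if c <= max_distortion:  # step 606: imperceptible, stop here
            break
    return best
```

If the distortion never drops below `max_distortion`, the loop runs for `max_iters` rounds (the M of the text) and the least-distorted feasible set is returned, mirroring the fallback described above.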
Step 607: Fine-tune the frequency-domain samples X with the m optimal scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$; that is, divide the frequency-domain samples of each block by the optimal scaling factor of that block to obtain the fine-tuned spectrum $Y$:
$$Y = \left[\, \frac{X(0,1,\dots,n_1)}{g_{1opt}},\ \frac{X(n_1+1,n_1+2,\dots,n_2)}{g_{2opt}},\ \dots,\ \frac{X(n_{m-1}+1,n_{m-1}+2,\dots,N)}{g_{mopt}} \,\right]$$
The fine-tuned frequency-domain samples $Y$ obtained through steps 601 to 607 are sent to the encoder.
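As a small illustration of step 607, the sketch below splits a list of frequency-domain samples into contiguous blocks and divides each block by its own scaling factor. The function names, the `boundaries` argument (the last index $n_1, n_2, \dots$ of every block except the final one), and the sample data are all assumptions made for the example.

```python
def split_blocks(samples, boundaries):
    """Split the sample list into m contiguous blocks.
    `boundaries` holds the last index n_1, n_2, ... of every block except the final one."""
    blocks, start = [], 0
    for n in boundaries:
        blocks.append(samples[start:n + 1])
        start = n + 1
    blocks.append(samples[start:])
    return blocks

def fine_tune(samples, boundaries, gains):
    """Step 607: divide each block by its own scaling factor."""
    blocks = split_blocks(samples, boundaries)
    if len(blocks) != len(gains):
        raise ValueError("need exactly one scaling factor per block")
    return [[x / g for x in block] for block, g in zip(blocks, gains)]

X = [8.0, 6.0, 4.0, 4.0, 2.0, 1.0]
Y = fine_tune(X, boundaries=[1, 3], gains=[0.5, 1.0, 2.0])
print(Y)  # [[16.0, 12.0], [4.0, 4.0], [1.0, 0.5]]
```

A factor below 1 enlarges the low-frequency block before quantization, while a factor above 1 shrinks the high-frequency block, which matches the adjustment direction of step 603.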
Because the decoder needs the scaling factors to recover the data, the scaling factors must be transmitted in the coded stream. They can be transmitted in several ways, described below.
Mode 1 of transmitting the scaling factors: encode all m scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency samples. The decoder can then recover the data relatively accurately.
Mode 2 of transmitting the scaling factors: among the m scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency samples, select one as the reference scaling factor, compute the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode these m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, only $g_{2opt}/g_{1opt}, g_{3opt}/g_{1opt}, \dots, g_{mopt}/g_{1opt}$ need to be encoded. This reduces the number of bits consumed.
Mode 3 of transmitting the scaling factors: among the m scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ used to fine-tune the frequency samples, select one as the reference scaling factor, compute the ratios of the remaining m-1 scaling factors to the reference scaling factor, and encode the reference scaling factor together with the m-1 ratios. For example, with $g_{1opt}$ as the reference scaling factor, $g_{1opt}$ and $g_{2opt}/g_{1opt}, g_{3opt}/g_{1opt}, \dots, g_{mopt}/g_{1opt}$ are encoded. This not only reduces the number of bits consumed; because the decoder can compute $g_{2opt}, \dots, g_{mopt}$ from $g_{1opt}$ and the ratios, the data can also be recovered relatively accurately.
To avoid spending too many bits when multiple scaling factors are used, the number of scaling factors can be chosen according to the required coding bit rate and quantization quality. In low-bit-rate coding, for example, 2 to 3 scaling factors can be used.
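The reference-plus-ratios scheme of Mode 3 can be illustrated with a short sketch. It only shows the arithmetic on the raw values; how the reference and the ratios are actually quantized and written to the bitstream is not specified in the text, and the function names are hypothetical.

```python
def encode_mode3(gains):
    """Mode 3: send the reference factor g_1 and the m-1 ratios g_k / g_1."""
    reference = gains[0]
    ratios = [g / reference for g in gains[1:]]
    return reference, ratios

def decode_mode3(reference, ratios):
    """Decoder side: rebuild all m factors from the reference and the ratios."""
    return [reference] + [reference * r for r in ratios]

ref, ratios = encode_mode3([0.5, 1.0, 2.0])
print(ref, ratios)                # 0.5 [2.0, 4.0]
print(decode_mode3(ref, ratios))  # [0.5, 1.0, 2.0]
```

Mode 2 corresponds to transmitting only `ratios` and dropping `ref`, in which case the decoder knows the factors only up to the common reference value.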
Step 503: Quantize the frequency-domain samples obtained through multi-scaling-factor control, and output the quantized frequency-domain samples $X_q$.
In step 503, different quantization methods can be used depending on the coding requirements, for example multi-stage vector quantization, split vector quantization, tree-structured quantization, or lattice vector quantization.
Step 504: Remove the effect of the scaling factors from the quantized frequency samples $X_q$ obtained in step 503 to restore the original frequency-domain samples $X_{balance}$; that is, apply gain balancing to $X_q$ to obtain $X_{balance}$.
The gain-balancing method depends on how the scaling factors are transmitted in step 502.
If the scaling factors are transmitted by Mode 1 or Mode 3 above, gain balancing can use the scaling factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ selected in step 502. Specifically, the quantized frequency samples $X_q$ are divided into m parts using the same band partition as in step 601, giving $X_q(0,1,\dots,n_1)$, $X_q(n_1+1,n_1+2,\dots,n_2)$, ..., $X_q(n_{m-1}+1,n_{m-1}+2,\dots,N)$, and the quantized frequency samples of each part are multiplied by the scaling factor of that part:
$$X_{balance} = [\, g_{1opt} \cdot X_q(0,1,\dots,n_1),\ g_{2opt} \cdot X_q(n_1+1,n_1+2,\dots,n_2),\ \dots,\ g_{mopt} \cdot X_q(n_{m-1}+1,n_{m-1}+2,\dots,N) \,]$$
If the scaling factors are transmitted by Mode 2 above, gain balancing can use the ratios of the scaling factors. Specifically, the quantized frequency samples $X_q$ are again divided into m parts using the band partition of step 601, giving $X_q(0,1,\dots,n_1)$, $X_q(n_1+1,n_1+2,\dots,n_2)$, ..., $X_q(n_{m-1}+1,n_{m-1}+2,\dots,N)$. The part that belongs to the reference scaling factor is multiplied by 1, and every other part is multiplied by the ratio of its scaling factor to the reference scaling factor. Assuming that the scaling factor $g_{1opt}$ of the first part is the reference scaling factor, gain balancing is expressed as:
$$X_{balance} = \left[\, X_q(0,1,\dots,n_1),\ \frac{g_{2opt}}{g_{1opt}} \cdot X_q(n_1+1,n_1+2,\dots,n_2),\ \dots,\ \frac{g_{mopt}}{g_{1opt}} \cdot X_q(n_{m-1}+1,n_{m-1}+2,\dots,N) \,\right]$$
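The two gain-balancing variants can be compared with a brief sketch. The block data and function names are assumptions for illustration; the quantized blocks are taken to be already split according to the band partition of step 601.

```python
def gain_balance(blocks_q, factors):
    """Step 504: multiply every quantized block by its factor."""
    return [[f * x for x in block] for block, f in zip(blocks_q, factors)]

def factors_from_ratios(ratios):
    """Ratio-only case: the reference block gets factor 1, the rest get g_k / g_1."""
    return [1.0] + list(ratios)

blocks_q = [[16.0, 12.0], [4.0, 4.0], [1.0, 0.5]]
full = gain_balance(blocks_q, [0.5, 1.0, 2.0])
relative = gain_balance(blocks_q, factors_from_ratios([2.0, 4.0]))
print(full)      # [[8.0, 6.0], [4.0, 4.0], [2.0, 1.0]]
print(relative)  # [[16.0, 12.0], [8.0, 8.0], [4.0, 2.0]]
```

In this example the ratio-based result equals the full result multiplied by the constant $1/g_{1opt}$. Since that constant applies to the whole spectrum, a single gain applied later to the time-domain signal can compensate for it.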
Step 505: Apply the inverse time-frequency transform to $X_{balance}$ obtained after gain balancing, converting the restored frequency-domain samples into restored time-domain samples $x_q(n)$.
Step 506: Compute the optimal global gain $g_{gopt}$ from the original time-domain samples $x(n)$ and the restored time-domain samples $x_q(n)$.
Here, the global gain $g_g$ that minimizes the mean square error between the original and the restored time-domain samples can be taken as the optimal global gain $g_{gopt}$; that is, $g_{gopt}$ minimizes $\sum_n [x(n) - g_g \cdot x_q(n)]^2$. The optimal global gain is therefore:
$$g_{gopt} = \frac{\sum_n x(n) \cdot x_q(n)}{\sum_n x_q(n)^2}$$
The optimal global gain $g_{gopt}$ must also be encoded and transmitted so that the decoder can use it for data recovery.
The above is the procedure for adjusting quantization quality at the encoder through multiple scaling factors. Corresponding to this adjustment during encoding, the decoder recovers the time-domain samples from the decoded quantized frequency samples through the procedure shown in Figure 7, which includes the following steps:
Step 701: Apply gain balancing to the quantized frequency samples using the scaling factors obtained from the coded stream. The implementation is the same as the method described in step 504 and is not repeated here. Note that the gain-balancing method depends on how the scaling factors are transmitted, and that the encoder and the decoder must use the same gain-balancing method.
Step 702: Apply the inverse time-frequency transform to the frequency-domain samples obtained after gain balancing to obtain time-domain samples.
Step 703: Multiply the time-domain samples by the global gain obtained from the coded stream to obtain the recovered time-domain samples.
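The least-squares global gain of step 506 and its use at the decoder in step 703 can be sketched as follows. The function names and the convention of returning 0 for an all-zero reconstruction are assumptions made for this example.

```python
def optimal_global_gain(x, x_q):
    """Least-squares gain minimizing sum((x[n] - g * x_q[n]) ** 2)."""
    energy = sum(v * v for v in x_q)
    if energy == 0.0:
        return 0.0  # assumed convention for an all-zero reconstruction
    return sum(a * b for a, b in zip(x, x_q)) / energy

def apply_global_gain(x_q, gain):
    """Step 703: scale the decoded time-domain samples by the transmitted gain."""
    return [gain * v for v in x_q]

x = [2.0, -4.0, 6.0]     # original time-domain samples
x_q = [1.0, -2.0, 3.0]   # restored samples before the global gain
g = optimal_global_gain(x, x_q)
print(g)                          # 2.0
print(apply_global_gain(x_q, g))  # [2.0, -4.0, 6.0]
```

The closed form follows from setting the derivative of the squared error with respect to the gain to zero.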
The multi-scaling-factor control used in Embodiment 1 can also be applied directly to time-domain samples, that is, to the case without a time-frequency transform; correspondingly, there is no inverse time-frequency transform when the global gain is computed. In that case the time-domain samples can be partitioned by time segment when the scaling factors are set, and when they are adjusted the scaling factors of the more important time segments can be lowered and those of the less important time segments raised.
Embodiment 2
Embodiment 2 provides a method of adjusting quantization quality through multiple scaling factors and spectral shaping.
Figure 8 is a schematic block diagram of the encoder that adjusts quantization quality in Embodiment 2. During encoding, the time-domain samples are first converted to the frequency domain by the time-frequency transform, then pass through spectral pre-shaping and multi-scaling-factor control, and are then quantized and output as quantized samples. The quantized samples pass through gain balancing, spectral inverse shaping, and the inverse time-frequency transform, after which the optimal global gain is computed. The coded stream carries three parts: the scaling factors, the quantized frequency-domain samples, and the global gain.
Figure 9 is a schematic block diagram of the decoder that adjusts quantization quality in Embodiment 2. During decoding, the quantized frequency-domain samples pass through gain balancing, spectral inverse shaping, and the inverse time-frequency transform to give time-domain samples, which are finally multiplied by the global gain to restore the time-domain samples.
In Embodiment 2, quantization quality is adjusted through multiple scaling factors and peak shaping by extending the procedure of Figure 5 in Embodiment 1: a spectral pre-shaping step is added between the time-frequency transform of step 501 and the multi-scaling-factor control of step 502, and a spectral inverse shaping step is added between the gain balancing of step 504 and the inverse time-frequency transform of step 505. The implementation of spectral pre-shaping and spectral inverse shaping is described in detail below.
Figure 10 is a schematic diagram of spectral pre-shaping, which can be implemented by the following steps.
Step 1001: Determine the spectral shaping region and, among the frequency-domain samples obtained in step 501 that lie within this region, mark the peak set $\{p_m, m = 1, \dots, M\}$.
The spectral shaping region is the spectral region of the more important band. In audio data, for example, the human auditory system has higher frequency resolution at low frequencies, so the low-frequency part is regarded as the more important band. Similarly, in video and image data most of the information is concentrated at low frequencies, so the low-frequency part is again regarded as the more important band. The spectral shaping region can therefore be the front part of the full band, for example the first quarter.
Here, a peak can be defined as a local maximum of the amplitude within the shaped spectral segment: if $X(i) > X(j)$ for all $j \in [i-\Delta, i+\Delta]$, $j \neq i$, then $X(i)$ is the local maximum of the $2\Delta+1$ points in $[i-\Delta, i+\Delta]$. The size of the local region can be chosen freely.
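The peak definition above translates into a short search routine. The sketch below is illustrative only: the function name, the `region_end` argument that bounds the shaping region, and the choice to clip the window at the ends of the array are assumptions not stated in the text.

```python
def find_peaks(spectrum, region_end, delta):
    """Return indices i < region_end where spectrum[i] is strictly greater than
    every other sample within delta positions (window clipped at the array ends)."""
    peaks = []
    for i in range(region_end):
        lo, hi = max(0, i - delta), min(len(spectrum) - 1, i + delta)
        if all(spectrum[i] > spectrum[j] for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks

magnitudes = [1.0, 5.0, 2.0, 3.0, 9.0, 4.0, 1.0, 7.0]
print(find_peaks(magnitudes, region_end=6, delta=1))  # [1, 4]
```

Because the comparison is strict, two marked peaks can never lie within `delta` positions of each other.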
Step 1002: Compute the reference value $p_{ref}$ used for spectral pre-shaping.
The reference value is chosen so that its magnitude stays the same before and after spectral shaping. In step 1002, the largest peak in the peak set $\{p_m, m = 1, \dots, M\}$ can be used as the reference value $p_{ref}$, or the largest local energy can be used as $p_{ref}$. To limit the influence of quantization error, a characteristic parameter of a block of data can also be used as $p_{ref}$, so that quantization error does not affect the reference value strongly. Preferably, $p_{ref}$ can be the energy of the data points near the largest peak in the peak set $\{p_m, m = 1, \dots, M\}$, or the average energy.
Step 1003: Compute an amplification factor $R_m$ for each peak $p_m$ in the peak set $\{p_m, m = 1, \dots, M\}$. (The expression for $R_m$ is given as an image in the original publication and is not reproduced here.) It has two parameters, one of which lies in the interval $(0, 1)$; suitable values can be chosen according to the actual situation.
Step 1004: Amplify the peaks using the computed amplification factors. To keep the reference value $p_{ref}$ unchanged, every peak point $p_m$ other than those used to compute $p_{ref}$ is multiplied by its amplification factor $R_m$, giving the amplified peak $R_m \cdot p_m$.
Because the human auditory system has very high frequency resolution at low frequencies, amplifying the peak energy in the low-frequency part lets the quantizer capture the peaks. For this reason Embodiment 2 amplifies only a small number of spectral points, namely the peaks, and this spectral pre-shaping technique may also be called peak pre-shaping. Peak pre-shaping increases the global gain only slightly, and the increase in quantization error caused by the larger global gain is negligible. To make the spectral shaping more effective, the spectral points around a peak can also be amplified: for example, when the local peak of a $2\Delta+1$-point region is amplified, the $2\Delta$ or fewer points around it can be amplified with corresponding amplification factors as well.
This spectral pre-shaping raises the peaks of the frequency-domain samples in the more important band. It thereby reduces the quantization error at the smaller peaks in that band and lowers the probability that spectral peaks in the more important band are lost in quantization.
In the encoder, the time-domain samples must also be recovered from the quantized frequency samples in order to compute the optimal global gain. If spectral pre-shaping is used, then after $X_{balance}$ has been obtained by the gain balancing of step 504, spectral inverse shaping must be applied to $X_{balance}$. The procedure is shown in Figure 11 and includes the following steps:
Step 1101: In $X_{balance}$ obtained in step 504, mark the peak set $\{p_m, m = 1, \dots, M\}$ of the frequency-domain samples in the spectral shaping region. The spectral shaping region and the peak-marking criterion used in spectral inverse shaping should be the same as those used in spectral pre-shaping.
Step 1102: Compute the reference value $p_{ref}$ used for spectral inverse shaping. The criterion for computing the reference value should also be the same as in spectral pre-shaping. For example, if spectral pre-shaping uses the energy of the data points near the largest peak in the peak set $\{p_m, m = 1, \dots, M\}$ as the reference value, then spectral inverse shaping should also use the energy of the data points near the largest peak in the peak set as the reference value.
Step 1103: Compute a reduction factor $r_m$ for each peak $p_m$ in the peak set $\{p_m, m = 1, \dots, M\}$. The parameters used here should be the same as those used in spectral pre-shaping.
The reduction factor $r_m$ of spectral inverse shaping is derived from the amplification factor $R_m$ of spectral pre-shaping: for a peak point that was enlarged during pre-shaping, the reduction factor follows from the expression for the amplification factor. (The expressions for $r_m$ and $R_m$ and the intermediate derivation are given as images in the original publication and are not reproduced here.)
It follows from this way of computing the reduction factor that the reference value used for spectral inverse shaping need not be transmitted in the coded stream. On the same principle, the decoder can compute the reference value for spectral inverse shaping from the characteristics of its own sample values, and from it the reduction factor of each peak, so no additional bits are consumed.
Step 1104: Reduce the peaks using the computed reduction factors. Spectral inverse shaping should reduce exactly the peaks that were amplified during spectral pre-shaping. If pre-shaping amplified the peak points other than those used to compute the reference value, inverse shaping must likewise reduce the peak points other than those used to compute the reference value: every remaining peak point $p_m$, apart from the peak points related to the reference value $p_{ref}$, is divided by its reduction factor $r_m$, giving the reduced peak $p_m / r_m$.
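The interplay of steps 1001 to 1004 and 1101 to 1104 can be illustrated with a round-trip sketch. The amplification formula of the source is not recoverable from this text, so the sketch substitutes a hypothetical power-law rule $R = (p_{ref}/p)^k$ with $0 < k < 1$, takes the largest peak as the reference value, and assumes positive peak magnitudes and no quantization in between. It is meant only to show why an untouched reference peak lets the inverse step run without side information.

```python
def pre_shape(spectrum, peak_idx, k):
    """Amplify every peak except the reference (largest) peak.
    Assumed rule, not taken from the source: R = (p_ref / p) ** k with 0 < k < 1."""
    ref_i = max(peak_idx, key=lambda i: spectrum[i])
    p_ref = spectrum[ref_i]
    shaped = list(spectrum)
    for i in peak_idx:
        if i != ref_i:
            shaped[i] = spectrum[i] * (p_ref / spectrum[i]) ** k
    return shaped

def inverse_shape(shaped, peak_idx, k):
    """Undo pre_shape using only the shaped samples: the reference peak was left
    untouched, so p_ref can be recomputed here without being transmitted."""
    ref_i = max(peak_idx, key=lambda i: shaped[i])
    p_ref = shaped[ref_i]
    restored = list(shaped)
    for i in peak_idx:
        if i != ref_i:
            # shaped = p**(1-k) * p_ref**k, hence r = shaped / p = (p_ref / shaped) ** (k / (1 - k))
            r = (p_ref / shaped[i]) ** (k / (1.0 - k))
            restored[i] = shaped[i] / r
    return restored

spectrum = [1.0, 4.0, 2.0, 3.0, 16.0, 4.0]
peaks = [1, 4]  # peak indices, e.g. found with a local-maximum search
shaped = pre_shape(spectrum, peaks, k=0.5)
restored = inverse_shape(shaped, peaks, k=0.5)
```

Under the assumed rule an amplified peak never exceeds the reference peak, so the largest peak is the same before and after shaping and both sides pick the same reference.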
After spectral inverse shaping has been carried out by the steps above, step 505 applies the inverse time-frequency transform to the frequency-domain samples obtained from spectral inverse shaping.
In Embodiment 2, because spectral pre-shaping is performed between the time-frequency transform and the multi-scaling-factor control during encoding, the decoder must correspondingly perform spectral inverse shaping between gain balancing and the inverse time-frequency transform. The implementation is the same as the spectral inverse shaping performed in the encoding process above and is not repeated here.
In Embodiment 2 as described above, spectral pre-shaping is performed first and multi-scaling-factor control second. Alternatively, the encoder can perform multi-scaling-factor control first and spectral pre-shaping second; in that case, both when the encoder restores the original sample values and in the decoder, spectral inverse shaping is performed first and gain balancing second. This variant is not described in detail.
Embodiment 3
Embodiment 3 provides a method of adjusting quantization quality through spectral shaping.
Figure 12 is a schematic block diagram of the encoder that adjusts quantization quality in Embodiment 3. During encoding, the time-domain samples are first converted to the frequency domain by the time-frequency transform, then pass through spectral pre-shaping, and are then quantized and output as quantized samples. The quantized samples pass through spectral inverse shaping and the inverse time-frequency transform, after which the optimal global gain is computed. The coded stream carries two parts: the quantized frequency-domain samples and the global gain.
Figure 13 is a schematic block diagram of the decoder that adjusts quantization quality in Embodiment 3. During decoding, the quantized frequency-domain samples pass through spectral inverse shaping and the inverse time-frequency transform to give time-domain samples, which are finally multiplied by the global gain to restore the time-domain samples.
In Embodiment 3, spectral pre-shaping and spectral inverse shaping are implemented in the same way as in Embodiment 2 and achieve the same technical effect, so they are not described again here.
Embodiment 4
Embodiment 4 describes an apparatus that implements the adjustment of quantization quality.
Corresponding to the method of Embodiment 2, Figure 14 shows the structure of the apparatus in Embodiment 4 that adjusts quantization quality at the encoder. As shown in Figure 14, the apparatus comprises a time-frequency transform unit, a spectral pre-shaping unit, a multi-scaling-factor control unit, a quantization unit, a gain balancing unit, a spectral inverse shaping unit, an inverse time-frequency transform unit, and a global gain calculation unit. The time-frequency transform unit receives the first sample values, applies the time-frequency transform to them, and outputs the result to the spectral pre-shaping unit. The spectral pre-shaping unit receives the first sample values output by the time-frequency transform unit, applies spectral pre-shaping to them, and outputs the result to the multi-scaling-factor control unit. The multi-scaling-factor control unit receives the first sample values, sets two or more scaling factors for them, adjusts the first sample values with the scaling factors, and outputs the adjusted first sample values to the quantization unit. The quantization unit quantizes the received first sample values to obtain quantized sample values and outputs them to the gain balancing unit. The gain balancing unit receives the quantized sample values, removes the effect of the scaling factors from them to obtain second sample values, and outputs these to the spectral inverse shaping unit. The spectral inverse shaping unit receives the second sample values output by the gain balancing unit, applies spectral inverse shaping to them, and outputs the result to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the second sample values from the spectral inverse shaping unit, applies the inverse time-frequency transform to them, and outputs the result to the global gain calculation unit. The global gain calculation unit receives the first sample values and the second sample values and derives the global gain from them.
所述多缩放因子控制单元包括:缩放因子设置单元和采样值调整 单元;所述缩放因子设置单元用于对第一采样值设置两个或两个以上 缩放因子, 并将所设置的缩放因子输出给所述采样值调整单元; 所述 釆样值调整单元用于接收缩放因子,并利用缩放因子对第一釆样值进 行调整。  The multi-scale factor control unit includes: a scaling factor setting unit and a sample value adjusting unit; the scaling factor setting unit is configured to set two or more scaling factors for the first sampling value, and output the set scaling factor And the sample value adjustment unit is configured to receive a scaling factor, and adjust the first sample value by using a scaling factor.
所述缩放因子设置单元包括: 基准值设置单元、 缩放因子调整单 元、 消耗比特数估计单元、 感知失真计算单元; 所述基准值设置单元 用于设置缩放因子的基准值, 并输出给所述缩放因子调整单元; 所述 缩放因子调整单元用于根据基准值调整缩放因子,并输出给所述消耗 比特数估计单元和感知失真计算单元;所述消耗比特数估计单元用于 根据缩放因子, 估计消耗比特数, 并判断消耗比特数是否小于编码所 允许的总比特数, 将判断结果发送给所述缩放因子调整单元; 所述感 知失真计算单元用于根据缩放因子,计算感知失真, 并判断感知失真 是否在无法感知的范围内, 将判断结果发送给所述缩放因子调整单 元。 The scaling factor setting unit includes: a reference value setting unit, a scaling factor adjusting unit, a consumption bit number estimating unit, and a perceptual distortion calculating unit; the reference value setting unit is configured to set a reference value of the scaling factor, and output the scaling value to the scaling a factor adjustment unit; the scale factor adjustment unit is configured to adjust a scaling factor according to a reference value, and output the result to the consumption bit number estimation unit and the perceptual distortion calculation unit; the consumption bit number estimation unit is configured to estimate consumption according to a scaling factor The number of bits, and determining whether the number of consumed bits is smaller than the total number of bits allowed by the encoding, and transmitting the determination result to the scaling factor adjusting unit; the perceptual distortion calculating unit is configured to calculate the perceptual distortion according to the scaling factor, and determine the perceptual distortion Whether the result of the determination is sent to the scaling factor adjustment unit within a range that is not perceptible.
The spectrum pre-shaping unit includes a peak marking unit, a reference value calculating unit, an amplification factor calculating unit, and a pre-shaping unit. The peak marking unit receives the first sample values, marks the peaks among the first sample values within the spectrum shaping region, and outputs them to the reference value calculating unit. The reference value calculating unit uses a peak to calculate the reference value for spectrum pre-shaping and outputs it to the amplification factor calculating unit. The amplification factor calculating unit uses the reference value to calculate an amplification factor for each marked peak and outputs the factors to the pre-shaping unit. The pre-shaping unit uses the amplification factors to pre-shape the spectrum.
The spectrum inverse shaping unit includes a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit receives the sample values, marks the peaks among the sample values within the spectrum shaping region, and outputs them to the reference value calculating unit. The reference value calculating unit uses a peak to calculate the reference value for spectrum inverse shaping and outputs it to the reduction factor calculating unit. The reduction factor calculating unit uses the reference value to calculate a reduction factor for each marked peak and outputs the factors to the inverse shaping unit. The inverse shaping unit uses the reduction factors to inverse-shape the spectrum.
Corresponding to the method of Embodiment 2, FIG. 15 shows the structure of the apparatus for adjusting quantization quality at the decoding end in Embodiment 4. As shown in FIG. 15, the apparatus includes a gain balancing unit, a spectrum inverse shaping unit, an inverse time-frequency transform unit, and a global gain balancing unit. The gain balancing unit receives the quantized sample values and the scaling factors, uses the received scaling factors to remove their influence from the quantized sample values to obtain sample values, and outputs these to the spectrum inverse shaping unit. The spectrum inverse shaping unit receives the sample values output by the gain balancing unit, performs spectrum inverse shaping on them, and outputs the result to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the sample values from the spectrum inverse shaping unit, applies the inverse time-frequency transform, and outputs the result to the global gain balancing unit. The global gain balancing unit receives the global gain and the sample values, multiplies the sample values by the global gain, and outputs the result. The global gain balancing unit may be a multiplier.
As at the encoding end, the spectrum inverse shaping unit at the decoding end includes a peak marking unit, a reference value calculating unit, a reduction factor calculating unit, and an inverse shaping unit. The peak marking unit receives the sample values, marks the peaks among the sample values within the spectrum shaping region, and outputs them to the reference value calculating unit. The reference value calculating unit uses a peak to calculate the reference value for spectrum inverse shaping and outputs it to the reduction factor calculating unit. The reduction factor calculating unit uses the reference value to calculate a reduction factor for each marked peak and outputs the factors to the inverse shaping unit. The inverse shaping unit uses the reduction factors to inverse-shape the spectrum.
Of course, corresponding to the methods of Embodiments 1 and 3 and to their specific implementations, apparatuses of different structures may be used to adjust the quantization quality. The functions of the units in these apparatuses have been described in detail above and are not repeated here.
The embodiments described above can be applied to various coding fields such as audio coding, video coding, and image coding.
From the description of the above embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this understanding, the essence of the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention. The above discloses only several specific embodiments of the present invention; the invention is not limited to them, and any variation that a person skilled in the art can conceive shall fall within the scope of protection of the present invention.
The above are merely process and method embodiments of the present invention and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of the present invention shall fall within their scope of protection.

Claims

1. A method for adjusting quantization quality in encoding, characterized in that the method comprises:
adjusting a first sample value to be encoded by using two or more scaling factors, and quantizing the adjusted first sample value to obtain a quantized sample value;
removing the influence of the scaling factors from the obtained quantized sample value to obtain a second sample value, and obtaining a global gain by using the first sample value and the second sample value; and
outputting the obtained quantized sample value, information on the two or more scaling factors, and the obtained global gain as an encoded stream.
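As an illustration only, the three steps of claim 1 can be sketched in Python. The sketch assumes details the claim leaves open: the adjustment is a division by the scaling factor (so a smaller factor gives finer quantization), the quantizer is plain rounding, each scaling factor covers one contiguous index range, and the global gain is a least-squares fit. The names `encode`, `parts`, and `factors` are hypothetical, and the time-frequency transforms of claim 2 are omitted.

```python
from typing import Dict, List, Sequence, Tuple

Part = Tuple[int, int]  # half-open index range [start, end) covered by one factor


def encode(first: Sequence[float], parts: Sequence[Part],
           factors: Sequence[float]) -> Dict[str, object]:
    """Sketch of claim 1; division and rounding are assumptions, not the source's."""
    # Step 1: adjust each part by its own scaling factor, then quantize.
    quantized: List[int] = [0] * len(first)
    for (start, end), f in zip(parts, factors):
        for i in range(start, end):
            quantized[i] = round(first[i] / f)

    # Step 2: remove the scaling factors again to get the second sample values.
    second: List[float] = [0.0] * len(first)
    for (start, end), f in zip(parts, factors):
        for i in range(start, end):
            second[i] = quantized[i] * f

    # Global gain: least-squares fit of gain * second to first (assumed criterion).
    energy = sum(y * y for y in second)
    gain = sum(x * y for x, y in zip(first, second)) / energy if energy else 1.0

    # Step 3: everything the encoded stream carries.
    return {"quantized": quantized, "factors": list(factors), "gain": gain}
```

With a single part and a coarse factor, `encode([3.0, 3.0], [(0, 2)], [4.0])` quantizes both samples to 1, reconstructs them as 4.0, and compensates with a gain of 0.75.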
2. The method according to claim 1, characterized in that:
the first sample value and the second sample value are time-domain sample values;
before the first sample value is adjusted, the method further comprises: converting the time-domain first sample value into a frequency-domain first sample value;
adjusting the first sample value by using the scaling factors is: adjusting the frequency-domain first sample value by using the scaling factors;
quantizing the adjusted first sample value to obtain the quantized sample value is: quantizing the adjusted frequency-domain first sample value to obtain the quantized sample value;
obtaining the second sample value from the quantized sample value is: removing the influence of the scaling factors from the quantized sample value to obtain a frequency-domain second sample value;
after the second sample value is obtained and before the global gain is obtained, the method further comprises: converting the frequency-domain second sample value into a time-domain second sample value; and
obtaining the global gain by using the first sample value and the second sample value is: obtaining the global gain by using the time-domain first sample value and the time-domain second sample value.
3. The method according to claim 2, characterized in that:
converting the time-domain first sample value into the frequency-domain first sample value is: performing the conversion by a discrete Fourier transform, a fast Fourier transform, a discrete cosine transform, or a wavelet transform.
4. The method according to claim 2, characterized in that:
the two or more scaling factors are: two or more scaling factors set for the frequency-domain first sample value.
5. The method according to claim 4, characterized in that:
setting two or more scaling factors for the frequency-domain first sample value is: dividing the frequency-domain first sample values into two or more parts and setting one scaling factor for each part.
6. The method according to claim 5, characterized in that:
adjusting the frequency-domain first sample value by using the scaling factors is: adjusting the frequency-domain first sample values of each part by using the scaling factor of that part.
7. The method according to claim 6, characterized in that:
removing the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain first sample values were divided, and removing, from the quantized sample values of each part, the influence of the scaling factor of that part.
8. The method according to claim 7, characterized in that:
outputting the information on the two or more scaling factors as an encoded stream is: outputting the two or more scaling factors as an encoded stream.
9. The method according to claim 6, characterized in that:
after a scaling factor is set for each part, the method further comprises: selecting the scaling factor of one part as a reference scaling factor, and calculating the ratio of the scaling factor of each remaining part to the reference scaling factor;
removing the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain first sample values were divided, and removing, by using the obtained ratios, the influence of the scaling factor of the corresponding part from the quantized sample values of each part.
10. The method according to claim 9, characterized in that outputting the information on the two or more scaling factors as an encoded stream is: outputting the ratios of the scaling factors of the remaining parts to the reference scaling factor as an encoded stream.
11. The method according to claim 9, characterized in that:
removing the influence of the scaling factors from the obtained quantized sample value is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain first sample values were divided, calculating the scaling factor of each part from the reference scaling factor and the obtained ratios, and removing, by using the scaling factor of each part, the influence of the scaling factor of the corresponding part from the quantized sample values of each part.
12. The method according to claim 11, characterized in that outputting the information on the two or more scaling factors as an encoded stream is: outputting the reference scaling factor and the ratios of the scaling factors of the remaining parts to the reference scaling factor as an encoded stream.
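The reference-plus-ratio representation of claims 9 to 12 can be sketched as a pair of helper functions. This is a minimal illustration, not the patent's bitstream format: the function names and the `ref_index` parameter are hypothetical, and the sketch assumes the reference scaling factor is nonzero.

```python
from typing import List, Sequence, Tuple


def to_reference_and_ratios(factors: Sequence[float],
                            ref_index: int = 0) -> Tuple[float, List[float]]:
    """Pick one factor as the reference and express the others as ratios to it."""
    reference = factors[ref_index]
    ratios = [f / reference for i, f in enumerate(factors) if i != ref_index]
    return reference, ratios


def from_reference_and_ratios(reference: float, ratios: Sequence[float],
                              ref_index: int = 0) -> List[float]:
    """Rebuild every per-part factor from the reference and the ratios."""
    factors = [reference * r for r in ratios]
    factors.insert(ref_index, reference)  # put the reference back in its slot
    return factors
```

Claim 10 transmits only the ratios, while claim 12 transmits the reference as well, which is what allows the second function to recover the original factors.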
13. The method according to claim 6, characterized in that:
setting one scaling factor for each part is: adjusting the scaling factor of each part according to the number of consumed bits and the perceptual distortion to obtain an optimal scaling factor for each part.
14. The method according to claim 13, characterized in that:
adjusting the scaling factor of each part to obtain the optimal scaling factor is:
setting a reference value for the scaling factors, the reference value making the number of consumed bits smaller than the total number of bits allowed for encoding;
adjusting the scaling factor of each part on the basis of the reference value;
determining whether the adjusted scaling factors make the number of consumed bits smaller than the total number of bits allowed for encoding; if this condition is not satisfied, continuing the step of adjusting the scaling factors until it is satisfied; if it is satisfied, calculating the perceptual distortion; and
determining whether the perceptual distortion is within the imperceptible range; if so, taking the scaling factors obtained by this adjustment as the optimal scaling factors; otherwise, returning to the step of adjusting the scaling factors and repeating that step and the subsequent steps.
15. The method according to claim 14, characterized in that the number of consumed bits is estimated from the frequency-domain first sample values, the number of frequency-domain first sample values, and the scaling factors.
16. The method according to claim 14, characterized in that the perceptual distortion is obtained from the frequency-domain first sample values and the scaling factor of each part.
17. The method according to claim 14, characterized in that:
when the perceptual distortion is within the perceptible range, the step of adjusting the scaling factors and the subsequent steps are repeated a specified number of times;
if the perceptual distortion is still within the perceptible range after the specified number of repetitions, the scaling factors that minimize the perceptual distortion are selected, from the scaling factors adjusted during the repetitions, as the optimal scaling factors.
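The search described in claims 14 and 17 can be sketched as a bounded loop. The claims do not say how the factors are adjusted, how bits are estimated, or how distortion is measured, so the sketch takes all three as caller-supplied functions. Every name below is hypothetical, and the `demo_*` functions are toy stand-ins invented for this example. The sketch also bounds the bit-budget retries with the same `max_rounds` counter, which is a simplification.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Factors = List[float]


def search_scaling_factors(
    base: float,
    n_parts: int,
    adjust: Callable[[Factors], Factors],
    estimate_bits: Callable[[Sequence[float]], float],
    distortion: Callable[[Sequence[float]], float],
    bit_budget: float,
    threshold: float,
    max_rounds: int = 8,
) -> Optional[Factors]:
    """Return per-part factors, or None if no candidate fit the bit budget."""
    factors = [base] * n_parts           # start every part at the reference value
    best: Optional[Tuple[float, Factors]] = None
    for _ in range(max_rounds):
        factors = adjust(factors)        # adjust on the basis of the reference value
        if estimate_bits(factors) >= bit_budget:
            continue                     # over budget: keep adjusting
        d = distortion(factors)
        if d <= threshold:               # distortion judged imperceptible
            return factors
        if best is None or d < best[0]:  # remember the least-distorting candidate
            best = (d, list(factors))
    return best[1] if best else None     # claim 17 fallback


# Toy stand-ins, invented for this example only.
def demo_adjust(f: Factors) -> Factors:
    return [f[0] * 0.5, f[1] * 1.25]     # lower the low band, raise the high band


def demo_bits(f: Sequence[float]) -> float:
    return sum(16.0 / x for x in f)      # a smaller factor costs more bits


def demo_distortion(f: Sequence[float]) -> float:
    return f[0]                          # pretend only the low band is audible
```

In the toy setup, a strict threshold is never met within the bit budget, so the loop falls back to the least-distorting candidate it saw, as claim 17 describes.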
18. The method according to claim 14, characterized in that:
adjusting the scaling factor of each part on the basis of the reference value is: lowering the scaling factors of the important frequency band parts relative to the reference value, and raising the scaling factors of the unimportant frequency band parts relative to the reference value.
19. The method according to claim 18, characterized in that:
the important frequency band is the low frequency band, and the unimportant frequency band is the high frequency band.
20. The method according to claim 2, characterized in that:
before the frequency-domain first sample value is adjusted by using the scaling factors, the method further comprises: performing spectrum pre-shaping on the frequency-domain first sample value;
after the influence of the scaling factors is removed from the quantized sample value to obtain the frequency-domain second sample value, and before that value is converted into the time-domain second sample value, the method further comprises: performing spectrum inverse shaping on the frequency-domain second sample value.
21. The method according to claim 2, characterized in that:
after the frequency-domain first sample value is adjusted by using the scaling factors and before quantization, the method further comprises: performing spectrum pre-shaping on the adjusted frequency-domain first sample value;
after quantization and before the influence of the scaling factors is removed from the quantized sample value, the method further comprises: performing spectrum inverse shaping on the quantized sample value.
22. The method according to claim 20 or 21, characterized in that:
a spectrum shaping region is determined;
performing spectrum pre-shaping on the sample values is: performing spectrum pre-shaping on the sample values within the determined spectrum shaping region;
performing spectrum inverse shaping on the sample values is: performing spectrum inverse shaping on the sample values within the determined spectrum shaping region.
23. The method according to claim 22, characterized in that the spectrum pre-shaping comprises:
marking the peaks among the sample values within the determined spectrum shaping region;
calculating a reference value for spectrum pre-shaping by using one of the marked peaks;
calculating an amplification factor for each marked peak by using the reference value; and
pre-shaping the spectrum by using the calculated amplification factors.
24. The method according to claim 23, characterized in that:
marking the peaks of the sample values is: selecting one or more local regions within the spectrum shaping region and, in each local region, selecting the sample value with the largest amplitude as the peak of that local region.
25. The method according to claim 24, characterized in that:
pre-shaping the spectrum is: for the local regions containing the remaining peaks, that is, all peaks except the one used to calculate the reference value, performing pre-shaping by using the amplification factor of the corresponding peak.
26. The method according to claim 25, characterized in that:
the pre-shaping is: amplifying the peak by using the amplification factor, or amplifying the peak and the sample values in the local region containing the peak by using the amplification factor.
27. The method according to claim 23, characterized in that:
calculating the reference value is: selecting the largest of the marked peaks and obtaining the reference value from that largest peak.
28. The method according to claim 27, characterized in that the reference value is: the amplitude of the largest peak, or the energy of the sample points adjacent to the largest peak, or the average energy of the sample points adjacent to the largest peak.
29. The method according to claim 23, characterized in that:
the amplification factor of a peak is: the ratio of the reference value to the peak, raised to the power of a first parameter and multiplied by a second parameter, wherein the first parameter is a number greater than zero and less than 1, and the second parameter is an arbitrary number.
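As a hedged illustration of claims 23 to 29, the sketch below marks one peak per local region, takes the amplitude of the largest peak as the reference value, and amplifies every other peak by `b * (reference / peak) ** a`. The names `mark_peaks`, `pre_shape`, `regions`, `a`, and `b` are hypothetical. Among the options the claims allow, the sketch assumes the reference is the peak amplitude (claim 28) and that only the peak sample itself is amplified (the first alternative of claim 26).

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int]  # half-open index range [start, end) of one local region


def mark_peaks(spectrum: Sequence[float], regions: Sequence[Region]) -> List[int]:
    """Index of the largest-magnitude sample in each local region."""
    return [max(range(s, e), key=lambda i: abs(spectrum[i])) for s, e in regions]


def pre_shape(spectrum: Sequence[float], regions: Sequence[Region],
              a: float = 0.5, b: float = 1.0) -> List[float]:
    """Amplify every marked peak except the reference peak."""
    if not 0.0 < a < 1.0:
        raise ValueError("the first parameter must lie strictly between 0 and 1")
    out = list(spectrum)
    peaks = mark_peaks(spectrum, regions)
    ref_idx = max(peaks, key=lambda i: abs(spectrum[i]))  # largest marked peak
    reference = abs(spectrum[ref_idx])                    # assumed: its amplitude
    for i in peaks:
        peak = abs(spectrum[i])
        if i == ref_idx or peak == 0.0:
            continue                                      # reference stays unchanged
        out[i] = spectrum[i] * b * (reference / peak) ** a
    return out
```

Because the exponent is below 1, smaller peaks are lifted toward the reference without reaching it: with `a = 0.5`, a peak of 4.0 next to a reference of 16.0 is scaled by 2 to 8.0.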
30. The method according to claim 22, characterized in that the spectrum inverse shaping comprises:
marking the peaks among the sample values within the determined spectrum shaping region;
calculating a reference value for spectrum inverse shaping by using one of the marked peaks;
calculating a reduction factor for each marked peak by using the reference value; and
inverse-shaping the spectrum by using the calculated reduction factors.
31. The method according to claim 2, characterized in that:
obtaining the global gain by using the time-domain first sample value and the time-domain second sample value is: choosing the global gain that minimizes the mean square error between the time-domain first sample values and the time-domain second sample values multiplied by the global gain.
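One way to read this criterion, using symbols that are not in the source: let $x_n$ be the time-domain first sample values, $y_n$ the time-domain second sample values, $N$ their count, and $g$ the global gain. Assuming at least one $y_n$ is nonzero, setting the derivative of the mean square error to zero gives a closed form:

```latex
E(g) = \frac{1}{N}\sum_{n=1}^{N}\left(x_n - g\,y_n\right)^2,
\qquad
\frac{dE}{dg} = -\frac{2}{N}\sum_{n=1}^{N} y_n\left(x_n - g\,y_n\right) = 0
\;\Longrightarrow\;
g = \frac{\sum_{n=1}^{N} x_n y_n}{\sum_{n=1}^{N} y_n^{2}}.
```

Since $E(g)$ is a convex quadratic in $g$, this stationary point is the minimum.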
32. A method for adjusting quantization quality in decoding, in which an encoded stream output by an encoding end is decoded to obtain a decoded stream, characterized in that the method comprises:
obtaining, from the decoded stream, quantized sample values, information on two or more scaling factors, and a global gain; and
removing, by using the information on the two or more scaling factors, the influence of the scaling factors from the quantized sample values to obtain sample values, and then multiplying the sample values by the global gain.
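A minimal decoder-side sketch of claim 32 follows. It assumes that the encoder divided each part by its scaling factor, so removing the influence means multiplying it back; that assumption, the contiguous-range layout of `parts`, and the name `decode` are all hypothetical. The inverse transform and inverse shaping of claims 33 and 34 are omitted.

```python
from typing import List, Sequence, Tuple

Part = Tuple[int, int]  # half-open index range [start, end) covered by one factor


def decode(quantized: Sequence[int], parts: Sequence[Part],
           factors: Sequence[float], gain: float) -> List[float]:
    """Undo the per-part scaling (assumed to be a division), then apply the gain."""
    out = [0.0] * len(quantized)
    for (start, end), f in zip(parts, factors):
        for i in range(start, end):
            out[i] = quantized[i] * f * gain
    return out
```

For example, two quantized values of 1 in a single part with factor 4.0 and a global gain of 0.75 decode to 3.0 each.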
33. The method according to claim 32, characterized in that:
the quantized sample values are frequency-domain quantized sample values;
removing the influence of the scaling factors from the quantized sample values to obtain sample values is: removing the influence of the scaling factors from the quantized sample values to obtain frequency-domain sample values;
after the influence of the scaling factors is removed from the quantized sample values to obtain the sample values and before multiplication by the global gain, the method further comprises: converting the frequency-domain sample values into time-domain sample values.
34. The method according to claim 33, characterized in that:
after the influence of the scaling factors is removed from the frequency-domain quantized sample values to obtain the frequency-domain sample values, and before the frequency-domain sample values are converted into time-domain sample values, the method further comprises: performing spectrum inverse shaping on the frequency-domain sample values;
or, before the influence of the scaling factors is removed from the frequency-domain quantized sample values to obtain the frequency-domain sample values, the method further comprises: performing spectrum inverse shaping on the frequency-domain quantized sample values.
35. The method according to any one of claims 32 to 34, characterized in that:
the information on the scaling factors obtained from the decoded stream is: all of the scaling factors;
removing the influence of the scaling factors from the obtained quantized sample values is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain sample values were divided during encoding, and removing, by using the scaling factor of each part, the influence of the scaling factor of the corresponding part from the quantized sample values of that part.
36. The method according to any one of claims 32 to 34, characterized in that:
the information on the scaling factors obtained from the decoded stream is: the ratios of the remaining scaling factors to a reference scaling factor, one scaling factor being taken as the reference scaling factor;
removing the influence of the scaling factors from the obtained quantized sample values is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain sample values were divided during encoding, and removing, by using the obtained ratios, the influence of the scaling factor of the corresponding part from the quantized sample values of each part.
37. The method according to any one of claims 32 to 34, characterized in that:
the information on the scaling factors obtained from the decoded stream is: a reference scaling factor, one scaling factor being taken as the reference scaling factor, together with the ratios of the remaining scaling factors to the reference scaling factor;
removing the influence of the scaling factors from the obtained quantized sample values is: dividing the quantized sample values into the corresponding two or more parts in the same way the frequency-domain sample values were divided during encoding, calculating the scaling factor of each part from the reference scaling factor and the ratios, and removing, by using the scaling factor of each part, the influence of the scaling factor of the corresponding part from the quantized sample values of that part.
38. The method according to claim 34, characterized in that the spectrum inverse shaping comprises:
marking the peaks among the sample values within the spectrum shaping region determined during encoding;
calculating a reference value for spectrum inverse shaping by using one of the marked peaks;
calculating a reduction factor for each marked peak by using the reference value; and
inverse-shaping the spectrum by using the calculated reduction factors.
39. An apparatus for adjusting quantization quality in encoding, characterized in that the apparatus comprises: a multi-scaling-factor control unit, a quantization unit, a gain balancing unit, and a global gain calculation unit;
wherein the multi-scaling-factor control unit is configured to receive a first sample value, set two or more scaling factors for the first sample value, adjust the first sample value by using the scaling factors, and output the adjusted first sample value to the quantization unit;
the quantization unit is configured to quantize the received first sample value to obtain a quantized sample value and output it to the gain balancing unit;
the gain balancing unit is configured to receive the quantized sample value, remove the influence of the scaling factors from the quantized sample value to obtain a second sample value, and output it to the global gain calculation unit;
the global gain calculation unit is configured to receive the first sample value and the second sample value and obtain a global gain by using the first sample value and the second sample value.
40. The apparatus according to claim 39, characterized in that the apparatus further comprises: a time-frequency transform unit and an inverse time-frequency transform unit;
the time-frequency transform unit is configured to receive the first sample value, perform a time-frequency transform on it, and output the result to the multi-scaling-factor control unit;
the inverse time-frequency transform unit is configured to receive the second sample value from the gain balancing unit, perform an inverse time-frequency transform on it, and output the result to the global gain calculation unit.
41. The apparatus according to claim 40, characterized in that the apparatus further comprises: a spectrum pre-shaping unit and a spectrum inverse shaping unit;
the spectrum pre-shaping unit is configured to receive the first sample value output by the time-frequency transform unit, perform spectrum pre-shaping on it, and output the result to the multi-scaling-factor control unit; the spectrum inverse shaping unit is configured to receive the second sample value output by the gain balancing unit, perform spectrum inverse shaping on it, and output the result to the inverse time-frequency transform unit;
or,
the spectrum pre-shaping unit is configured to receive the first sample value output by the multi-scaling-factor control unit, perform spectrum pre-shaping on it, and output the result to the quantization unit; the spectrum inverse shaping unit is configured to receive the quantized sample value output by the quantization unit, perform spectrum inverse shaping on it, and output the result to the gain balancing unit.
42. The apparatus according to any one of claims 39 to 41, wherein the multi-scaling-factor control unit comprises: a scaling factor setting unit and a sample value adjusting unit;
the scaling factor setting unit is configured to set two or more scaling factors for the first sample value, and to output the set scaling factors to the sample value adjusting unit;
the sample value adjusting unit is configured to receive the scaling factors and to adjust the first sample value using the scaling factors.
43. The apparatus according to claim 42, wherein the scaling factor setting unit comprises: a reference value setting unit, a scaling factor adjustment unit, a consumed bit number estimation unit, and a perceptual distortion calculation unit;
the reference value setting unit is configured to set a reference value of the scaling factor and output it to the scaling factor adjustment unit;
the scaling factor adjustment unit is configured to adjust the scaling factor according to the reference value, and to output the scaling factor to the consumed bit number estimation unit and the perceptual distortion calculation unit;
the consumed bit number estimation unit is configured to estimate the number of consumed bits according to the scaling factor, determine whether the number of consumed bits is less than the total number of bits allowed for encoding, and send the determination result to the scaling factor adjustment unit;
the perceptual distortion calculation unit is configured to calculate the perceptual distortion according to the scaling factor, determine whether the perceptual distortion is within the imperceptible range, and send the determination result to the scaling factor adjustment unit.
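The feedback loop among the four sub-units of claim 43 can be sketched as follows. This is a minimal Python sketch under stated assumptions: the rounding quantizer, the logarithmic bit estimate, the mean-squared-error stand-in for perceptual distortion, the halving and doubling steps, and every function and parameter name are hypothetical and are not taken from the application.

```python
import math

def estimate_bits(bands, factors):
    """Hypothetical bit estimate: about log2(1 + |q|) + 1 bits per quantized sample."""
    return sum(math.log2(1 + abs(round(x * f))) + 1
               for band, f in zip(bands, factors) for x in band)

def band_distortion(band, factor):
    """Hypothetical distortion measure: mean squared quantization error of one band."""
    return sum((x - round(x * factor) / factor) ** 2 for x in band) / len(band)

def choose_scale_factors(bands, base, bit_budget, max_distortion, max_iter=32):
    """Start every band at the reference value, then adjust using both checks."""
    factors = [base] * len(bands)
    for _ in range(max_iter):
        if estimate_bits(bands, factors) >= bit_budget:
            # Bit check failed: quantize every band more coarsely.
            factors = [f / 2 for f in factors]
            continue
        errors = [band_distortion(b, f) for b, f in zip(bands, factors)]
        worst = max(range(len(bands)), key=errors.__getitem__)
        if errors[worst] <= max_distortion:
            return factors          # both checks pass
        factors[worst] *= 2         # distortion check failed: refine the worst band
    return factors                  # iteration cap reached; best effort

# Hypothetical input: two bands, reference value 1.0.
bands = [[0.3, -0.7, 1.2], [5.0, -3.1, 0.4]]
factors = choose_scale_factors(bands, base=1.0, bit_budget=40, max_distortion=0.02)
```

The iteration cap is a design choice of the sketch: the two checks can pull in opposite directions, so a bounded loop avoids oscillating forever when no setting satisfies both.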
44. The apparatus according to claim 41, wherein the spectrum pre-shaping unit comprises: a peak marking unit, a reference value calculation unit, an amplification factor calculation unit, and a pre-shaping unit;
wherein the peak marking unit is configured to receive the first sample value, mark peaks among the first sample values within the spectrum shaping region, and output the marked peaks to the reference value calculation unit;
the reference value calculation unit is configured to calculate, from the peaks, a reference value for spectrum pre-shaping, and to output it to the amplification factor calculation unit;
the amplification factor calculation unit is configured to calculate, using the reference value, an amplification factor for each marked peak, and to output it to the pre-shaping unit;
the pre-shaping unit is configured to pre-shape the spectrum using the amplification factors.
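The four stages of claim 44 (mark peaks, derive a reference value, derive per-peak amplification factors, apply them) can be sketched in Python. The claim does not define any of these computations, so the local-maximum peak test, the choice of the largest peak as reference, the `(ref / |peak|) ** alpha` factor, and the names `mark_peaks`, `pre_shape`, and `alpha` are all assumptions of this sketch.

```python
def mark_peaks(spectrum, lo, hi):
    """Indices in [lo, hi) whose magnitude is a strict local maximum (assumed test)."""
    mags = [abs(x) for x in spectrum]
    start, stop = max(lo, 1), min(hi, len(mags) - 1)
    return [i for i in range(start, stop)
            if mags[i] > mags[i - 1] and mags[i] > mags[i + 1]]

def pre_shape(spectrum, lo, hi, alpha=0.5):
    """Amplify marked peaks toward an assumed reference value (the largest peak)."""
    peaks = mark_peaks(spectrum, lo, hi)
    shaped = list(spectrum)
    if not peaks:
        return shaped, {}
    ref = max(abs(spectrum[i]) for i in peaks)
    # A strict local maximum has non-zero magnitude, so the division is safe.
    factors = {i: (ref / abs(spectrum[i])) ** alpha for i in peaks}
    for i, g in factors.items():
        shaped[i] *= g
    return shaped, factors

# Hypothetical spectrum with a strong peak at index 1 and a weak one at index 3.
spectrum = [0.1, 4.0, 0.2, 1.0, 0.1, 0.3]
shaped, factors = pre_shape(spectrum, lo=0, hi=len(spectrum))
```

With these assumptions the weak peak is lifted by a factor of 2 while the strongest peak is left unchanged, which narrows the dynamic range seen by the quantizer.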
45. The apparatus according to claim 41, wherein the spectrum inverse shaping unit comprises: a peak marking unit, a reference value calculation unit, a reduction factor calculation unit, and an inverse shaping unit;
wherein the peak marking unit is configured to receive the sample value, mark peaks among the sample values within the spectrum shaping region, and output the marked peaks to the reference value calculation unit;
the reference value calculation unit is configured to calculate, from the peaks, a reference value for spectrum inverse shaping, and to output it to the reduction factor calculation unit;
the reduction factor calculation unit is configured to calculate, using the reference value, a reduction factor for each marked peak, and to output it to the inverse shaping unit;
the inverse shaping unit is configured to inversely shape the spectrum using the reduction factors.
46. An apparatus for adjusting quantization quality in decoding, wherein the apparatus comprises: a gain balancing unit and a global gain balancing unit;
wherein the gain balancing unit is configured to receive a quantized sample value and scaling factors, remove the influence of the scaling factors from the quantized sample value using the received scaling factors to obtain a sample value, and output the sample value to the global gain balancing unit;
the global gain balancing unit is configured to receive a global gain and the sample value, multiply the sample value by the global gain, and output the result.
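The two decoder-side units of claim 46 amount to a per-band division followed by a single multiplication. A minimal Python sketch follows, assuming that removing the influence of a scaling factor means dividing by it and that bands are described by a hypothetical `band_edges` list; neither detail is stated in the claim.

```python
def gain_balance(quantized, scale_factors, band_edges):
    """Remove each band's scaling factor from the quantized samples (assumed: divide)."""
    samples = []
    for k, factor in enumerate(scale_factors):
        lo, hi = band_edges[k], band_edges[k + 1]
        samples.extend(q / factor for q in quantized[lo:hi])
    return samples

def global_gain_balance(samples, global_gain):
    """Multiply every sample by the received global gain."""
    return [s * global_gain for s in samples]

# Hypothetical received data: two bands of two samples each.
quantized = [4, -2, 2, 1]
scale_factors = [2.0, 4.0]
band_edges = [0, 2, 4]
decoded = global_gain_balance(
    gain_balance(quantized, scale_factors, band_edges), global_gain=2.0)
# decoded == [4.0, -2.0, 1.0, 0.5]
```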
47. The apparatus according to claim 46, wherein the apparatus further comprises: an inverse time-frequency transform unit;
the inverse time-frequency transform unit is configured to receive the sample value from the gain balancing unit, perform an inverse time-frequency transform on the sample value, and output the result to the global gain balancing unit.
48. The apparatus according to claim 47, wherein the apparatus further comprises: a spectrum inverse shaping unit;
the spectrum inverse shaping unit is configured to receive the sample value output by the gain balancing unit, perform spectrum inverse shaping on the sample value, and output the result to the inverse time-frequency transform unit;
or,
the spectrum inverse shaping unit is configured to receive the quantized sample value, perform spectrum inverse shaping on the quantized sample value, and output the result to the gain balancing unit.
49. The apparatus according to claim 48, wherein the spectrum inverse shaping unit comprises: a peak marking unit, a reference value calculation unit, a reduction factor calculation unit, and an inverse shaping unit;
wherein the peak marking unit is configured to receive the sample value, mark peaks among the sample values within the spectrum shaping region, and output the marked peaks to the reference value calculation unit;
the reference value calculation unit is configured to calculate, from the peaks, a reference value for spectrum inverse shaping, and to output it to the reduction factor calculation unit;
the reduction factor calculation unit is configured to calculate, using the reference value, a reduction factor for each marked peak, and to output it to the inverse shaping unit;
the inverse shaping unit is configured to inversely shape the spectrum using the reduction factors.
PCT/CN2007/003799 2006-12-01 2007-12-26 A method and an apparatus for adjusting quantization quality in encoder and decoder WO2008064577A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07855801A EP2104095A4 (en) 2006-12-01 2007-12-26 A method and an apparatus for adjusting quantization quality in encoder and decoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN 200610164330 CN101192410B (en) 2006-12-01 2006-12-01 Method and device for regulating quantization quality in decoding and encoding
CN200610164330.X 2006-12-01

Publications (2)

Publication Number Publication Date
WO2008064577A1 true WO2008064577A1 (en) 2008-06-05
WO2008064577A8 WO2008064577A8 (en) 2009-05-07

Family

ID=39467436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/003799 WO2008064577A1 (en) 2006-12-01 2007-12-26 A method and an apparatus for adjusting quantization quality in encoder and decoder

Country Status (3)

Country Link
EP (1) EP2104095A4 (en)
CN (1) CN101192410B (en)
WO (1) WO2008064577A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609674B (en) * 2008-06-20 2011-12-28 华为技术有限公司 Method, device and system for coding and decoding
CN101964690B (en) * 2009-07-22 2012-07-04 联芯科技有限公司 HARQ merged decoding method, device and system
JP5316896B2 (en) * 2010-03-17 2013-10-16 ソニー株式会社 Encoding device, encoding method, decoding device, decoding method, and program
CN102821069B (en) * 2011-06-07 2018-06-08 中兴通讯股份有限公司 Base station and uplink data compression method on base station side
CN103354091B (en) * 2013-06-19 2015-09-30 北京百度网讯科技有限公司 Based on audio feature extraction methods and the device of frequency domain conversion
CN105721879B (en) * 2016-01-26 2018-08-31 北京空间飞行器总体设计部 A kind of area-of-interest transmission method under survey of deep space image segmentation protection
CN111429944B (en) * 2020-04-17 2023-06-02 北京百瑞互联技术有限公司 Development test optimization method and system for codec

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0497413A1 (en) * 1991-02-01 1992-08-05 Koninklijke Philips Electronics N.V. Subband coding system and a transmitter comprising the coding system
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
WO1996014695A1 (en) 1994-11-04 1996-05-17 Philips Electronics N.V. Encoding and decoding of a wideband digital information signal
CN1241336A (en) * 1997-07-29 2000-01-12 皇家菲利浦电子有限公司 Variable bitrate video coding method and corresponding video coder
JP2000244325A (en) * 1999-02-24 2000-09-08 Alpine Electronics Inc Method for decoding mpeg audio
CN1318904A (en) * 2001-03-13 2001-10-24 北京阜国数字技术有限公司 Practical sound coder based on wavelet conversion
US20040143431A1 (en) 2003-01-20 2004-07-22 Mediatek Inc. Method for determining quantization parameters
US20050254586A1 (en) 2004-05-12 2005-11-17 Samsung Electronics Co., Ltd. Method of and apparatus for encoding/decoding digital signal using linear quantization by sections
US20060074693A1 (en) 2003-06-30 2006-04-06 Hiroaki Yamashita Audio coding device with fast algorithm for determining quantization step sizes based on psycho-acoustic model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19638997B4 (en) * 1995-09-22 2009-12-10 Samsung Electronics Co., Ltd., Suwon Digital audio coding method and digital audio coding device
CA2252170A1 (en) * 1998-10-27 2000-04-27 Bruno Bessette A method and device for high quality coding of wideband speech and audio signals
US6912496B1 (en) * 1999-10-26 2005-06-28 Silicon Automation Systems Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
EP0497413A1 (en) * 1991-02-01 1992-08-05 Koninklijke Philips Electronics N.V. Subband coding system and a transmitter comprising the coding system
US5621855A (en) * 1991-02-01 1997-04-15 U.S. Philips Corporation Subband coding of a digital signal in a stereo intensity mode
WO1996014695A1 (en) 1994-11-04 1996-05-17 Philips Electronics N.V. Encoding and decoding of a wideband digital information signal
CN1241336A (en) * 1997-07-29 2000-01-12 皇家菲利浦电子有限公司 Variable bitrate video coding method and corresponding video coder
JP2000244325A (en) * 1999-02-24 2000-09-08 Alpine Electronics Inc Method for decoding mpeg audio
CN1318904A (en) * 2001-03-13 2001-10-24 北京阜国数字技术有限公司 Practical sound coder based on wavelet conversion
US20040143431A1 (en) 2003-01-20 2004-07-22 Mediatek Inc. Method for determining quantization parameters
US20060074693A1 (en) 2003-06-30 2006-04-06 Hiroaki Yamashita Audio coding device with fast algorithm for determining quantization step sizes based on psycho-acoustic model
US20050254586A1 (en) 2004-05-12 2005-11-17 Samsung Electronics Co., Ltd. Method of and apparatus for encoding/decoding digital signal using linear quantization by sections

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2104095A4 *

Also Published As

Publication number Publication date
CN101192410A (en) 2008-06-04
EP2104095A1 (en) 2009-09-23
WO2008064577A8 (en) 2009-05-07
EP2104095A4 (en) 2012-07-18
CN101192410B (en) 2010-05-19

Similar Documents

Publication Publication Date Title
JP5539203B2 (en) Improved transform coding of speech and audio signals
JP4977471B2 (en) Encoding apparatus and encoding method
KR101221918B1 (en) A method and an apparatus for processing a signal
TWI601130B (en) Audio encoding apparatus
JP5013863B2 (en) Encoding apparatus, decoding apparatus, communication terminal apparatus, base station apparatus, encoding method, and decoding method
CN1890711B (en) Method for encoding a digital signal into a scalable bitstream, method for decoding a scalable bitstream
US9037454B2 (en) Efficient coding of overcomplete representations of audio using the modulated complex lapped transform (MCLT)
CN103069484B (en) Time/frequency two dimension post-processing
JP2022050609A (en) Audio-acoustic coding device, audio-acoustic decoding device, audio-acoustic coding method, and audio-acoustic decoding method
JP5267362B2 (en) Audio encoding apparatus, audio encoding method, audio encoding computer program, and video transmission apparatus
JP5418930B2 (en) Speech decoding method and speech decoder
US9443534B2 (en) Bandwidth extension system and approach
WO2008064577A1 (en) A method and an apparatus for adjusting quantization quality in encoder and decoder
JP6368029B2 (en) Noise signal processing method, noise signal generation method, encoder, decoder, and encoding and decoding system
WO2005096274A1 (en) An enhanced audio encoding/decoding device and method
JP4548348B2 (en) Speech coding apparatus and speech coding method
JP2009501358A (en) Low bit rate audio signal encoding / decoding method and apparatus
US20080140393A1 (en) Speech coding apparatus and method
RU2530926C2 (en) Rounding noise shaping for integer transform based audio and video encoding and decoding
TW201724087A (en) Apparatus for coding envelope of signal and apparatus for decoding thereof
WO2009109139A1 (en) A super-wideband extending coding and decoding method, coder and super-wideband extending system
JP2006259517A (en) Speech processor and speech processing method
EP1873753A1 (en) Enhanced audio encoding/decoding device and method
WO2010000179A1 (en) A frequency band expanding method, system and apparatus
US20130006644A1 (en) Method and device for spectral band replication, and method and system for audio decoding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07855801

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2007855801

Country of ref document: EP